Adv in Health Sci Educ DOI 10.1007/s10459-013-9480-6

Validating MMI scores: are we measuring multiple attributes?
Tom Oliver • Kent Hecker • Peter A. Hausdorf • Peter Conlon

Received: 8 April 2013 / Accepted: 17 November 2013 © Springer Science+Business Media Dordrecht 2014

Abstract The multiple mini-interview (MMI) used in health professional schools’ admission processes is reported to assess multiple non-cognitive constructs such as ethical reasoning, oral communication, or problem evaluation. Though validation studies have been performed with total MMI scores, there is a paucity of information regarding how well MMI scores differentiate the constructs being measured, the relationship between MMI scores (construct or total) and personality characteristics, and how well MMI scores (construct or total) predict future performance in practice. Results from such studies could assist with MMI station development, rater training, score interpretation, and resource allocation. The purpose of this study was to investigate the validity of MMI construct scores (oral communication and problem evaluation), and their relationship to personality measures (emotionality and extraversion) and specific scores from standardized clinical communications interviews (building the relationship and explaining and planning). Confirmatory factor analysis results support a two-factor MMI model; however, the correlation between these factors was .87. Oral communication MMI scores significantly correlated with extraversion (rc = .25, p < .05), but MMI scores were not related to emotionality. Scores for building a relationship were significantly related to MMI oral communication scores (rc = .46, p < .001) and problem evaluation scores (rc = .43, p < .001); scores for explaining and planning were significantly related to MMI problem evaluation scores (rc = .36, p < .01). The results provide validity evidence for assessing multiple non-cognitive attributes during the MMI process and reinforce the importance of developing MMI stations and scoring rubrics for attributes identified as important for future success in school and practice.
Keywords Multiple mini-interview · Validation · Confirmatory factor analysis · Personality testing · Admissions

T. Oliver · P. A. Hausdorf · P. Conlon
University of Guelph, Guelph, Canada

K. Hecker (corresponding author)
Veterinary Clinical and Diagnostic Sciences, University of Calgary, 3330 Hospital Drive NW, Calgary, AB T2N 4N1, Canada
e-mail: [email protected]


Introduction
The multiple mini-interview (MMI) is an interview method used in health professional school selection to assess the non-cognitive attributes of applicants (Eva et al. 2004). Whereas more traditional (and often less structured) interviews have been found to have poor reliability and validity in health professional school selection (Kreiter et al. 2004; Eva et al. 2004; Albanese et al. 2003; Edwards et al. 1990), previous studies have found MMI scores to have sufficient reliability and to be significantly correlated with performance in school and on licensure exams (Eva et al. 2009, 2012; Hecker and Violato 2011; Reiter et al. 2007). In addition, there is consistent evidence for the discriminant validity of total MMI scores from ratings of cognitive skill such as incoming grade point average (GPA; Eva et al. 2004, 2009, 2012; Reiter et al. 2007), which suggests that the MMI is measuring something other than cognitive skill. The purpose of the MMI is to assess multiple non-cognitive attributes (e.g. oral communication and ethical reasoning) that are believed to be important for and predictive of future success in the profession and clinical practice. While validation studies have been performed with total MMI scores, there is a paucity of information regarding how well MMI scores differentiate the constructs being measured, the relationship between MMI scores (construct or total) and personality characteristics believed to be important for future performance, and how well MMI scores (construct or total) predict future performance in practice. The results of these types of studies could assist with MMI station development, rater training, score interpretation, and resource allocation for the development and delivery of this interview method. To better understand the non-cognitive constructs being assessed within an MMI and the use of MMI scores for high-stakes admission decisions, convergent and discriminant validity evidence is required.
Validation theory calls for a clear framework and argument for test score interpretation and use (Kane 2013; Messick 1989). The framework for this study is as follows. If MMI stations are developed to assess multiple non-cognitive constructs that should be predictive for future performance, then:
1. The MMI scores should represent the conceptually distinct, a priori identified non-cognitive attributes of interest (discriminant validity);
2. MMI scores for the non-cognitive attributes should align with conceptually relevant personality characteristics that are predictive for future performance (convergent validity); and
3. MMI scores and related personality scores should align with conceptually relevant practice-based performance assessments (convergent validity).
The rest of the introduction provides an overview and rationale of the non-cognitive attributes typically assessed in MMI stations, the conceptually relevant personality characteristics (and associated measures) that have correlated with job-related performance, and the conceptually relevant practice-based constructs used in this study.

Non-cognitive attributes assessed by the MMI
Non-cognitive attributes can include a variety of individual differences related to attitudes, personality traits, and motivations (Schmitt et al. 2009). MMIs have been designed to have raters assess candidates on multiple constructs (e.g. oral communication and moral reasoning) both within and across interview stations. However, MMI measures of different


non-cognitive attribute constructs assessed within the same station have been found to be highly correlated (Eva et al. 2004; Lemay et al. 2007; Roberts et al. 2009). As a result, it is common practice to report total scores (i.e. the average score across all measures) within each station instead of construct-based scores. This could limit the interpretability of MMI scores, as it is less clear whether MMI scores are capturing aspects of one construct, the other construct, both, or neither. Two of the more distinct interpersonal constructs that an MMI can attempt to measure are oral communication and problem evaluation. Oral communication is the ability to convey verbal messages constructively; and problem evaluation is the ability to identify and take into account multiple perspectives from various different stakeholders in decision making and judgment. MMI stations typically require candidates to discuss how they would respond if faced with a range of hypothetical situations, for example: an ambiguous problem, an emotionally challenging social interaction, or a complex moral issue. For all of these situations, raters have an opportunity to observe and rate the candidate on the clarity of their language and confidence in their conveyed verbal response (oral communication), and the breadth and depth to which they can explore underlying issues within cases and correctly balance pros and cons for the situation (problem evaluation). The MMI stations, including scenario, scoring sheet and accompanying rubric in this study were purposefully built to assess these two constructs.

MMI measures related to personality characteristics
Personality measures can be used to validate measures of non-cognitive attributes assessed during the MMI. Meta-analytic evidence in the organizational sciences literature clearly demonstrates that personality measures are moderately related to job-relevant behaviors (Tett et al. 1991; Rothstein and Goffin 2006). Studies in health settings have also found this moderate relationship (Goffin et al. 2011; Lievens et al. 2009). Three exploratory studies have investigated the relationship between total MMI scores and personality measures (Griffin and Wilson 2012; Jerant et al. 2012; Kulasegaram et al. 2010). These studies found mixed evidence that MMI scores were related to agreeableness (a personality trait that also measures individuals’ tendencies to feel empathy with others) and extraversion, and no evidence that MMI scores were related to conscientiousness, emotional stability, or openness. One explanation for the mixed results may be that personality traits were correlated with an overall MMI score. When personality traits are instead compared to MMI scores for specific constructs, it may be possible to draw clearer conceptual relationships that point to where stronger empirical relationships should be expected. Two personality traits that are likely to be related to health professionals’ interpersonal performance are emotionality and extraversion (Ashton and Lee 2007). People who have high emotionality tend to feel empathy and sentimental attachments with others, are sensitive to dangerous and stressful situations, and feel dependent on the emotional support of others; people who are extraverted tend to feel confident when leading or addressing groups of people, enjoy social gatherings and interactions, and frequently experience positive feelings of enthusiasm and energy.
Studies in health professional schools have found that emotionality and extraversion are related to performance outcomes. For example, high sentimentality has been found to predict clinical performance of residents (Gough et al. 1991). Students with higher emotionality have tendencies to empathize and understand the needs of others, which leads to


more positive interactions with patients and health teams. In addition, extraversion has been found to be a strong predictor of grade point average for students in years when they are on practicum and internship, suggesting that this personality trait is predictive of performance scores that assess effective communication skills and interpersonal behaviour (Lievens et al. 2009). In this study we were interested in the relationship between MMI construct scores (oral communication and problem evaluation) and personality construct scores (extraversion and emotionality). Specifically, we hypothesized (as outlined below) that oral communication and extraversion would be related, and that problem evaluation and emotionality would be related. Extraversion is a trait that includes tendencies such as acting confident with others and expressing enthusiasm and energy (Ashton and Lee 2007). These tendencies are most directly measured within the MMI when assessing oral communication, which includes ratings for confidence, eye contact, and liveliness in communicating. Emotionality is a trait that includes tendencies such as being familiar with the anxieties and fears that come with stressful situations, and feeling emotional connections with others (Ashton and Lee 2007). In other words, individuals who score high on emotionality tend to have more experience feeling a range of emotions and discussing those emotions with others. Thus, it was expected that emotionality would be most strongly related to problem evaluation in the MMI, as problem evaluation measures the depth and openness with which candidates explore interpersonal perspectives and sensitivities.

MMI measures related to future performance
Overall MMI scores have been related to Objective Structured Clinical Examination (OSCE) scores of patient interaction and ethical/legal reasoning (Eva et al. 2009, 2012). The current study seeks to extend these findings by investigating the relationship between MMI measures of two distinct constructs (oral communication and problem evaluation) and communication skill interview scores of students’ effectiveness in building a relationship (i.e. building a patient’s or client’s feelings of rapport and trust with the practitioner) and effectiveness in explaining and planning (i.e. building a patient’s or client’s understanding and motivation to support an action plan; Silverman et al. 2005). By comparing MMI scores to specific performance criteria, it is possible to test whether different MMI measures have stronger predictive relationships with the more conceptually related measures of performance (Chan 2005). Oral communication should be more closely related to building a relationship, and problem evaluation should be more closely related to explaining and planning. Oral communication encompasses the ability to use verbal and nonverbal messages to effectively guide and direct the interview. It is a non-cognitive skill that can help to convey respect for others (Klein et al. 2006). Problem evaluation encompasses the ability to directly express an opinion and a willingness to identify and understand multiple perspectives.

Research objectives and hypotheses
To assess our validation framework, the research objectives of this study were: (1) to investigate if different MMI scores measure distinct non-cognitive attributes; and (2) to determine if MMI scores (construct specific or total MMI scores) are related to


conceptually relevant personality measures and conceptually relevant scores in a standardized clinical communication interview. Three sets of hypotheses were proposed.
H1 Given the explicit measurement and distinctiveness of oral communication and problem evaluation, there will be a stronger model fit for a 2-factor solution for MMI scores than for a 1-factor solution.
H2a Oral communication MMI scores will be positively related to building the relationship scores in a communication interview.
H2b Problem evaluation MMI scores will be positively related to explaining and planning scores in a communication interview.
H3a Oral communication MMI scores will be positively related to extraversion scores measured by the HEXACO-PI-R-60 (Ashton and Lee 2009).
H3b Problem evaluation MMI scores will be positively related to emotionality scores measured by the HEXACO-PI-R-60 (Ashton and Lee 2009).

Method
Sample
Participants were from a single cohort of first-year students in the Doctor of Veterinary Medicine (DVM) program at the Ontario Veterinary College (OVC). Prior to being accepted into the program, 186 candidates were required to take the MMI as part of the admissions process. All 186 candidates consented to release their MMI results. From this candidate pool, 102 were admitted into the DVM program. Sixty of these students (59%) consented to release their MMI and communication interview results for the purposes of the study, and volunteered to complete personality testing. The students completed the standardized clinical communication interview and personality testing approximately 8 months after completing the MMI. The study was approved by the University of Guelph Research Ethics Board.
Measures
MMI
The MMI consisted of eight 10-min stations, with two raters per station who each independently rated the participants on two constructs. The development of the MMI followed the description outlined in Hecker et al. (2009). The majority of the stations were developed at the University of Calgary, Canada and modified by the admissions committee at OVC. Minor modifications included contextualizing the station-specific scenarios to address the goals and objectives of OVC and using two 5-point items to score candidates as outlined below. The eight stations were meant to assess oral communication and problem evaluation for a range of issues relevant to success as a veterinarian. These issues were ethical and moral (2 stations), interpersonal (3 stations), intrapersonal (1 station), and professional (2 stations). At each station raters scored candidates on two items, one for each construct. Each item was scored on a scale of 1–5 (1 = unacceptable; 3 = meets expectations; 5 = exceptional) and each scale had a corresponding rubric which raters were trained to


use in a one-and-a-half-hour rater training session before the MMI. The scores for oral communication and problem evaluation were summed across all eight stations to produce one score for oral communication and one score for problem evaluation.
Communication interview
The standardized clinical communication interviews were initially designed by Adams and Ladner (2004) in consultation with practicing veterinarians. Each participant completed two communication interview stations designed to assess participants’ use of effective communication skills. Medical and technical knowledge requirements were minimal. The simulated client rated the participant immediately after each station on 7 items (using a 9-point scale) meant to assess two constructs, building the relationship (4 items) and explaining and planning (3 items). The two scores within each station were converted to T-scores to account for differences in simulated client scores between stations (Howell 2002). The mean of each participant’s T-scores across the two stations was used as his/her building the relationship and explaining and planning scores.
Personality
The personality traits of emotionality and extraversion were measured with the HEXACO-PI-R-60 (Ashton and Lee 2009). The 10-item emotionality scale assesses participants’ tendencies to feel empathy and sentimental attachment with others, to need emotional support from others, and to feel fear and anxiety in response to stress and danger (Lee and Ashton 2008). The 10-item extraversion scale was used to assess participants’ tendencies to feel confident and positive about themselves when leading others, and to feel energy, enthusiasm, and enjoyment when with others.
Analysis
Generalizability theory and Cronbach’s alpha were used to calculate the reliability of the MMI.
The generalizability analysis of the MMI was a partially nested design with the following facets: participant (n = 186), station (n = 8), rating item (n = 2), and rater nested within station (n = 2). Cronbach’s alpha was used to determine the internal consistency of the communication interview scores of the 4-item building the relationship and the 3-item explaining and planning scales. To address the first research objective, data from all 186 candidates were analyzed using confirmatory factor analysis (CFA). CFA provides a fundamental strength over correlational analyses because it can account for station (method) and construct (trait) effects (Brown 2006). The models presented here can be thought of as an extension of a multi-trait multi-method matrix analysis (MTMM; Campbell and Fiske 1959) where each station is considered a method and each method assesses both constructs. CFA has been used in the organizational sciences to assess discriminant validity for within-method ratings that were not identifiable in correlation analyses (Hoffman et al. 2011). Two models were hypothesized a priori, one with one underlying factor and the second with two underlying factors. Each station measured two traits, oral communication and problem evaluation. Given there were only two ratings per method (i.e. station), a correlated uniqueness model was used in which the errors of the two within-station scores are correlated (Brown 2006). These models were developed and tested in EQS 6.1 (Bentler 2004).
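The internal consistency statistic named above can be illustrated with a short sketch. It assumes the standard Cronbach's alpha formula based on item variances and the variance of the summed score; the function name and the example ratings are hypothetical and are not data from this study.

```python
from statistics import variance


def cronbach_alpha(ratings):
    """Cronbach's alpha for a respondents-by-items table of ratings.

    `ratings` is a list of rows, one row per respondent, each row holding
    that respondent's score on every item of the scale.
    """
    n_items = len(ratings[0])
    if n_items < 2 or len(ratings) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_variances = [
        variance([row[j] for row in ratings]) for j in range(n_items)
    ]
    # Sample variance of the summed scale score.
    total_variance = variance([sum(row) for row in ratings])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical 9-point ratings for five participants on a 4-item scale
# (illustrative values only, not study data).
example_ratings = [
    [7, 8, 7, 6],
    [5, 5, 6, 5],
    [9, 8, 9, 9],
    [4, 5, 4, 5],
    [6, 7, 6, 7],
]
alpha = cronbach_alpha(example_ratings)
print(f"alpha = {alpha:.2f}")
```

Items that rank respondents identically push alpha toward 1, whereas items that share little variance pull it toward 0.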


The fit indices used were the root mean squared error of approximation (RMSEA), the standardized root mean square residual (SRMR), and Bentler’s comparative fit index (CFI). Between-model differences were assessed using the Δχ² reduction. To address the second research objective, one-tailed Pearson’s correlations were used to examine the hypothesized relations between personality scores, MMI ratings (both construct scores and total MMI scores), and communication scores for the 60 students who consented to be part of the study. There was more variance in the MMI scores from the original sample of candidates (186) than in the subsample of participants (60) who were accepted into the DVM program (and hence on average had higher MMI scores than the general candidate pool). To address this issue, correlations involving the MMI were corrected using Thorndike’s (1949) formulas to account for direct range restriction.
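The range restriction correction can be sketched as follows. This is a minimal illustration that assumes the usual direct range restriction formula (often called Thorndike's Case 2), in which the observed correlation is rescaled by the ratio of unrestricted to restricted standard deviations. The function name and the standard deviations in the example are hypothetical, because the candidate-pool standard deviations are not reported in this section.

```python
import math


def correct_direct_range_restriction(r_xy, sd_restricted, sd_unrestricted):
    """Estimate the unrestricted correlation from a restricted-sample one.

    r_xy            observed correlation in the restricted (selected) sample
    sd_restricted   SD of the selection variable in the restricted sample
    sd_unrestricted SD of the selection variable in the full applicant pool
    """
    numerator = sd_unrestricted * r_xy
    denominator = math.sqrt(
        sd_unrestricted ** 2 * r_xy ** 2
        + sd_restricted ** 2
        - sd_restricted ** 2 * r_xy ** 2
    )
    return numerator / denominator


# Hypothetical example: an observed r of .40 when the applicant pool's SD is
# 1.2 times the SD among admitted students (assumed ratio, not study data).
corrected = correct_direct_range_restriction(0.40, sd_restricted=1.0, sd_unrestricted=1.2)
print(f"corrected r = {corrected:.2f}")
```

When the two standard deviations are equal the function returns the observed correlation unchanged, and the correction grows as the selected sample becomes more homogeneous relative to the pool.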

Results
The reliability for the MMI was Eρ² = .73, suggesting that scores for participants on oral communication and problem evaluation were generally consistent across raters and stations. Considerably more variance was accounted for by participant by station (14.4%) and participant by rater nested within station (22.8%) than by participant by item (3.4%; Table 1). Figure 1 and Table 2 compare the results from the 1- and 2-factor correlated uniqueness models of the MMI. Figure 1 presents the factor loadings, correlated errors and factor correlations, and Table 2 displays the CFA fit indices. The 2-factor solution had a significant Δχ² reduction (Δχ² = -35.52, p < .001) from the 1-factor solution. Furthermore, the CFI of .97 and the root mean-square error of approximation (RMSEA) of 0.05 both indicated that the 2-factor solution had strong model fit. Thus, as predicted by H1, the model fit indices suggest that the 2-factor solution is more effective at explaining the variance in MMI scores than the 1-factor solution. Caution should be taken when interpreting these results. Though the model fit indicators suggest that there is a conceptually viable 2-factor MMI model, the correlation between

Table 1 Generalizability analysis for participant (n = 186), station (n = 8), item (n = 2), and rater nested within station (n = 2 per station)

Source              df      SS          MS       σ²       %
Participant (p)     185     1,273.821   6.886    0.158    17.38
Station (s)         7       126.839     18.120   0.016    1.77
Rater (r)|Station   8       4.762       0.595    -0.001   -0.07
Item (i)            1       53.633      53.633   0.016    1.76
p*s                 1,295   1,723.005   1.331    0.131    14.44
p*r|s               1,480   1,028.488   0.695    0.207    22.76
p*i                 185     164.773     0.891    0.031    3.43
s*i                 7       37.946      5.421    0.013    1.45
i*r|s               8       3.241       0.405    0.001    0.07
p*s*i               1,295   506.898     0.391    0.055    6.07
p*i*r|s, error      1,480   416.009     0.281    0.281    30.92
Total                                            0.909

N = 186. G coefficient = 0.73. SS sum of squares, MS mean squares, σ² variance components
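The reported G coefficient can be approximately recovered from the variance components in Table 1. The sketch below assumes a relative (norm-referenced) coefficient in which station, rater within station, and item are all treated as random facets, so that every interaction involving the participant contributes to error. The paper does not state the formula, so this is an assumption; under it the result rounds to the reported .73.

```python
# Variance components involving the participant, taken from Table 1.
components = {
    "p": 0.158,        # participant (universe-score variance)
    "ps": 0.131,       # participant x station
    "pr:s": 0.207,     # participant x rater nested within station
    "pi": 0.031,       # participant x item
    "psi": 0.055,      # participant x station x item
    "pir:s,e": 0.281,  # residual
}

N_STATIONS, N_RATERS, N_ITEMS = 8, 2, 2


def relative_g_coefficient(vc, n_s, n_r, n_i):
    """Relative G coefficient for a p x (r:s) x i design (assumed formula)."""
    # Each error component is divided by the number of conditions it is
    # averaged over when a participant's scores are aggregated.
    relative_error = (
        vc["ps"] / n_s
        + vc["pr:s"] / (n_s * n_r)
        + vc["pi"] / n_i
        + vc["psi"] / (n_s * n_i)
        + vc["pir:s,e"] / (n_s * n_r * n_i)
    )
    return vc["p"] / (vc["p"] + relative_error)


g = relative_g_coefficient(components, N_STATIONS, N_RATERS, N_ITEMS)
print(f"G coefficient = {g:.2f}")
```

Changing `N_STATIONS` or `N_RATERS` in this sketch gives a rough decision-study view of how reliability would move with a longer or shorter circuit, again only under the assumed formula.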


Fig. 1 One factor (a) and two-factor (b) correlated uniqueness models. S—station, Com—oral communication score, PE—problem evaluation score


error values and the correlation between the two factors suggest that the model may be difficult to interpret. As displayed in Fig. 1, standardized factor loadings ranged from .55 to .71 for oral communication, and from .32 to .58 for problem evaluation (all with p < .001). The correlated uniqueness values for scores within the same station ranged from .48 to .67, and more importantly the correlation between the oral communication and problem evaluation factors was extremely high (r = .87). These results suggest that, while there is a good fit and rationale for a 2-factor model over a 1-factor model, there are method (station) and trait (attributes measured) effects present which limit the ability to conclude we are assessing two independent factors.
Table 3 displays the means, standard deviations, minimum and maximum scores, and internal consistency reliability for scores on the MMI, communication interview, and personality index. Table 4 displays the correlations among personality, MMI scores (construct and total), and communication interview scores. Our second set of hypotheses was supported, with significant correlations between MMI oral communication and building the relationship scores (rc = .46, p < .001), and between MMI problem evaluation and explaining and planning scores (rc = .43, p < .001). In addition, there was an unexpected positive correlation between MMI problem evaluation and building the relationship (rc = .37, p < .01), whereas there was no significant correlation between MMI oral communication

Table 2 Confirmatory factor analysis fit indices for two correlated uniqueness models

Model                  RMSEA   SRMR   CFI    χ²       df   Δχ²
1 Factor               0.07    0.06   0.94   173.60   96
2 Correlated factors   0.05    0.06   0.97   138.08   95   -35.52***

Model 1 corresponds to Fig. 1a, a one-factor solution, and Model 2 corresponds to Fig. 1b, a two-factor solution
SRMR standardized root mean square residual, RMSEA root mean-square error of approximation, CFI comparative fit index
*** p < .001
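The model comparison in Table 2 can be checked with a few lines of arithmetic. The sketch below assumes the standard chi-square difference test for nested models and one common RMSEA definition that uses N - 1 in the denominator (some software uses N instead). The closed form used for the p-value holds only for a difference of one degree of freedom, which is the case here. The function names are illustrative.

```python
import math

N = 186  # candidates included in the CFA


def chi2_upper_tail_df1(x):
    """P(X > x) for a chi-square variable with exactly 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def rmsea(chi2, df, n):
    """RMSEA from a model chi-square, using the N - 1 convention (assumed)."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


# Fit statistics reported in Table 2.
chi2_one_factor, df_one_factor = 173.60, 96
chi2_two_factor, df_two_factor = 138.08, 95

delta_chi2 = chi2_one_factor - chi2_two_factor
delta_df = df_one_factor - df_two_factor
p_value = chi2_upper_tail_df1(delta_chi2)  # valid because delta_df == 1

rmsea_one = rmsea(chi2_one_factor, df_one_factor, N)
rmsea_two = rmsea(chi2_two_factor, df_two_factor, N)

print(f"delta chi2 = {delta_chi2:.2f} on {delta_df} df, p = {p_value:.1e}")
print(f"RMSEA: 1-factor = {rmsea_one:.2f}, 2-factor = {rmsea_two:.2f}")
```

Under these assumptions the recomputed values agree with the rounded RMSEA and Δχ² entries in Table 2.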

Table 3 Descriptive statistics and internal consistency results for the MMI, communication interview, and HEXACO-PI-R-60 (personality)

                            Mean     SD      Min     Max      α
Personality
  Emotionality              3.19     0.71    1.50    4.40     .79
  Extraversion              3.36     0.60    1.80    4.40     .82
MMI
  Oral communication        61.57    7.59    43.00   74.00    .80
  Problem evaluation        59.28    5.96    49.00   73.00    .54
  Total score               120.85   12.59   93.00   147.00   .83
Communication interview
  Building a relationship   51.16    7.71    26.14   69.53    .86
  Explaining and planning   49.08    6.49    34.80   66.34    .86

N = 60


Table 4 Corrected and uncorrected correlations between personality, MMI ratings, and communication interview ratings

                                 1             2             3                 4                5                6
Personality
  1. Emotionality
  2. Extraversion                .04
MMI ratings
  3. Oral communication          -.08 (-.09)   .23* (.25)*
  4. Problem evaluation          .08 (.10)     .11 (.13)     .73***
  5. Total MMI                   -.01 (-.01)   .19 (.22)*    .95***            .91***
Communication interview
  6. Building the relationship   .19           .27*          .42*** (.46)***   .31** (.37)**    .40** (.46)***
  7. Explaining and planning     .28*          -.11          .12 (.13)         .36** (.43)**    .24* (.28)*      .29*

N = 60. Correlations in brackets corrected for range restriction
* Correlation is significant at the .05 level (1-tailed)
** Correlation is significant at the .01 level (1-tailed)
*** Correlation is significant at the .001 level (1-tailed)

Direct range restriction:

\[ R_{xy} = \frac{S_x r_{xy}}{\sqrt{S_x^2 r_{xy}^2 + s_x^2 - s_x^2 r_{xy}^2}} \]

where r_{xy} is the observed correlation between X and Y in the restricted sample, s_i is the estimated standard deviation of i in the restricted sample, S_i is the estimated standard deviation of i in the unrestricted sample, and R_{xy} is the estimated corrected correlation between X and Y in the unrestricted sample when only the restricted sample has been used


and explaining and planning (rc = .13, ns). In contrast, there were significant positive correlations between total MMI score and building the relationship (rc = .46, p < .001) and explaining and planning scores (rc = .28, p < .05). For our third set of hypotheses, MMI oral communication was significantly related to extraversion (rc = .25, p < .05). However, the hypothesized positive relationship between problem evaluation and emotionality was not supported (rc = .10, ns). Total MMI score was significantly related to extraversion (rc = .22, p < .05) but not emotionality (rc = -.01, ns). Finally, there were significant correlations between extraversion and building the relationship (r = .27, p < .05) and between emotionality and explaining and planning (r = .28, p < .05).

Discussion
The validation framework for this study was as follows. If MMI stations are developed to assess multiple non-cognitive constructs that should be predictive for future performance, then:
1. The MMI scores should represent the conceptually distinct, a priori identified non-cognitive attributes of interest (discriminant validity);
2. MMI scores for the non-cognitive attributes should align with conceptually relevant personality characteristics that are predictive for future performance (convergent validity); and
3. MMI scores and related personality scores should align with conceptually relevant practice-based performance assessments (convergent validity).
The major findings from this study were:
1. There was support for a two-factor model; however, the oral communication and problem evaluation constructs were highly correlated both within the model (.87) and in the correlation analyses with the actual data (.73; Table 4).
2. Oral communication MMI score was significantly correlated with extraversion (small but significant) and building the relationship scores, supporting Hypotheses 2a and 3a.
3. Problem evaluation MMI score was not significantly related to emotionality score but did correlate with building the relationship (not hypothesized) and explaining and planning, thus supporting Hypothesis 2b but not Hypothesis 3b.
4. Total MMI score had a weak but significant correlation with extraversion, and significant correlations with building the relationship and explaining and planning.
The first research objective was to investigate if different MMI scores measure distinct non-cognitive attributes. While there was a stronger and significantly better model fit for a two-factor model (Fig. 1) than a one-factor model, the two constructs were highly correlated (.87).
Thus while there was support for a two factor model, caution must be taken in concluding that we are measuring two truly distinct factors as there was weak evidence for discriminant validity between the two construct scores. If this were the end of the study (i.e. no predictive work followed) the practical implications of these findings could be outlined as the following. MMI station construction and scoring rubrics don’t really matter because we can’t parse out the non-cognitive attributes of interest OR it takes too much time and effort to develop meaningful MMI stations to assess important conceptually distinct constructs. However, given our findings from our second research objective as outlined below, there is evidence for investing the time and effort in MMI station


construction, creating appropriate scoring rubrics based upon attributes known to be predictive for future performance, and conducting rater training to ensure appropriate and fair assessment of the candidate. The second research objective was to test whether the MMI measures fit within the nomological network for non-cognitive constructs. Interestingly, even though the MMI scores of oral communication and problem evaluation were highly related, they were found to have different relationships to other measures of non-cognitive constructs. MMI scores for oral communication were significantly related to the conceptually relevant measures of building the relationship and the personality trait of extraversion (supporting Hypotheses 2a and 3a). In contrast, the oral communication MMI score was not significantly related to the less conceptually relevant measures of explaining and planning and the personality trait of emotionality. If we had only used total MMI score we would not have teased apart the relationship between oral communication and the nomological network for these non-cognitive attributes. Consistent with Hypothesis 2b, the MMI problem evaluation rating had a significant positive relationship with explaining and planning. However, the hypothesized relationship between the MMI problem evaluation measure and emotionality was not found (Hypothesis 3b). One explanation for why we did not find a relationship between the MMI problem evaluation measure and emotionality is that the MMI did not effectively measure students’ ability to empathize with others. The current MMI measured students’ ability to recognize the points of view of others, but it did not measure how students would express feelings of empathy towards others. One way that this could be done is to include stations that require candidates to engage more directly in an interaction. Eva et al.
(2004) provided an example of such a station, in which a candidate must engage in a direct conversation with a ‘standardized actor’ rather than simply explain how they would interact. However, such stations could be designed to be more content-relevant, so that the participants’ behaviors observed during the MMI are more likely to transfer to health practitioner contexts (Donnon and Paolucci 2008).

Another explanation for why no significant relationship was found between MMI problem evaluation and emotionality is that emotionality is a broad personality trait. As such, it captures a wide range of individual attributes (e.g., empathy towards others, sensitivity to physical harm). Narrow traits are often found to be more predictive than broad traits when there is a strong conceptual link to a specific criterion (Rothstein and Goffin 2006; Tett et al. 1991). For example, in Gough et al.’s (1991) study, the narrow trait of empathy was found to be more predictive of anesthesiology residents’ clinical performance than the broader trait of agreeableness. Thus, future MMI validation research may want to investigate the relationship between MMI measures and conceptually relevant narrow traits.

Finally, our findings support previous reports of the validity of the MMI for predicting future performance (Eva et al. 2009, 2012; Reiter et al. 2007). We extend these findings and provide evidence of convergent validity of MMI scores by exploring the nomological network among personality testing, MMI scores, and performance on communication interviews. Testing these three measures within the same study addresses the concern raised by Jerant et al. (2012) that, since MMI ratings are related to broad personality traits such as extraversion, selecting candidates based on the MMI may reduce the diversity in medical students’ personality. 
This should be considered a concern if the MMI were related to personality facets that are unrelated to performance on job-relevant criteria. Any valid selection process should lead to the selection of a more homogeneous group of successful candidates from the general applicant pool, wherein the
successful applicants are homogeneous only on the characteristics that lead to in-school or in-job success. The results from our study are consistent with previous findings (Klein 2009; Gough et al. 1991; Lievens et al. 2009) that demonstrate that emotionality and extraversion are related to effective explaining and planning and building a relationship when communicating with clients/patients. Thus, if these are personality traits that can lead to better performance within the clinical interview and potentially better health outcomes, then it can be argued that MMI scenarios should be designed to assess attributes related to these traits. This includes creating scoring rubrics that match the constructs being measured and providing appropriate rater training.

There are limitations to this study. First, it was limited to the test of one MMI conducted at one university in one country. MMIs are often customized to meet the selection needs of their respective universities, and are often revised within universities on a year-to-year basis. Therefore, the results from this study may not generalize to MMIs conducted at different schools. In addition, the study had a limited sample size, with a moderate participation rate (59%) in the personality testing phase. It is possible that the students who volunteered for this study differ from students who did not volunteer, or from students at other universities or in other medical disciplines.

In conclusion, this study provides further evidence that the MMI (both construct-specific and total score) is predictive of student performance in a communication interview. 
Furthermore, we have offered insight into the question of why the MMI is predictive. The results from our study suggest that researchers and practitioners are more likely to find support for the validity of their MMI measures if they clearly define the aspects of the non-cognitive attributes they intend to assess, and test the validity of these measures against other specific construct-relevant measures.

Acknowledgments An earlier version of this manuscript was submitted as part of the first author’s dissertation. The authors would like to acknowledge the constructive comments from Filip Lievens, Deborah Powell, and Rick Goffin on earlier versions of this manuscript.

References

Adams, C. L., & Ladner, L. (2004). Implementing a simulated client program: Bridging the gap between theory and practice. Journal of Veterinary Medical Education, 31, 138–145.
Albanese, M. A., Snow, M. H., Skochelak, S. E., Huggett, K. N., & Farrell, P. M. (2003). Assessing personal qualities in medical school admissions. Academic Medicine, 78, 313–321.
Ashton, M. C., & Lee, K. (2007). Empirical, theoretical, and practical advantages of the HEXACO model of personality structure. Personality and Social Psychology Review, 11, 150–166.
Ashton, M. C., & Lee, K. (2009). The HEXACO-60: A short measure of the major dimensions of personality. Journal of Personality Assessment, 91, 340–345.
Bentler, P. M. (2004). EQS 6 structural equations program manual. Encino, CA: Multivariate Software Inc.
Brown, T. A. (2006). Confirmatory factor analysis for applied research. New York, NY: Guilford Press.
Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81–105.
Chan, D. (2005). Current directions in personnel selection. Current Directions in Psychological Science, 14, 220–223.
Donnon, T., & Paolucci, E. O. (2008). A generalizability study of the medical judgment vignettes interview to assess students’ noncognitive attributes for medical school. BMC Medical Education, 8, 58.
Edwards, J. C., Johnson, E. K., & Molidor, J. B. (1990). The interview in the admission process. Academic Medicine, 65, 167–177.
Eva, K. W., Reiter, H. I., Rosenfeld, J., Trinh, K., Wood, T. J., & Norman, G. R. (2012). Association between a medical school admission process using the multiple mini-interview and national licensing examination scores. Journal of the American Medical Association, 308, 2233–2240.
Eva, K. W., Reiter, H. I., Trinh, K., Wasi, P., Rosenfeld, J., & Norman, G. R. (2009). Predictive validity of the multiple mini-interview for selecting medical trainees. Medical Education, 43, 767–775.
Eva, K. W., Rosenfeld, J., Reiter, H. I., & Norman, G. R. (2004). An admissions OSCE: The multiple mini-interview. Medical Education, 38, 314–326.
Goffin, R. D., Rothstein, M. G., Reider, M. J., Poole, A., Krajewski, H. T., Powell, D. M., et al. (2011). Choosing job-related personality traits: Developing valid personality-oriented job analysis. Personality and Individual Differences, 51, 646–651.
Gough, H. G., Bradley, P., & McDonald, J. S. (1991). Performance of residents in anesthesiology as related to measures of personality and interests. Psychological Reports, 68, 979–994.
Griffin, B., & Wilson, I. (2012). Associations between the big five personality factors and multiple mini-interviews. Advances in Health Sciences Education, 17, 377–388.
Hecker, K., Donnon, T., Fuentealba, C., Hall, D., Illanes, O., Morck, D. W., et al. (2009). Assessment of applicants to the veterinary curriculum using a multiple mini interview method. Journal of Veterinary Medical Education, 36(2), 166–173.
Hecker, K., & Violato, C. (2011). A generalizability analysis of a veterinary school multiple mini interview: Effect of number of interviewers, type of interviewers and number of stations. Teaching and Learning in Medicine, 23, 331–336.
Hoffman, B. J., Melchers, K. G., Blair, C. A., Kleinmann, M., & Ladd, R. T. (2011). Exercises and dimensions are the currency of assessment centers. Personnel Psychology, 64, 351–395.
Howell, D. C. (2002). Statistical methods for psychology (5th ed.). Pacific Grove, CA: Duxbury/Thomson Learning.
Jerant, A., Griffin, E., Rainwater, J., Henderson, M., Sousa, F., Bertakis, K. D., et al. (2012). Does applicant personality influence multiple mini-interview performance and medical school acceptance offers? Academic Medicine, 87, 1–10.
Kane, M. T. (2013). Validating the interpretations and uses of test scores. Journal of Educational Measurement, 50(1), 1–73. doi:10.1111/jedm.12000.
Klein, C. (2009). What do we know about interpersonal skills? A meta-analytic examination of antecedents, outcomes, and the efficacy of training. Doctoral dissertation, University of Central Florida.
Klein, C., DeRouin, R. E., & Salas, E. (2006). Uncovering workplace interpersonal skills: A review, framework, and research agenda. In G. P. Hodgkinson & J. K. Ford (Eds.), International review of industrial and organizational psychology (Vol. 21, pp. 80–126). New York: Wiley.
Kreiter, C. D., Yin, P., Solow, C., & Brennan, R. L. (2004). Investigating the reliability of the medical school admissions interview. Advances in Health Sciences Education, 9, 147–159.
Kulasegaram, K., Reiter, H. I., Wiesner, W., Hackett, R. D., & Norman, G. R. (2010). Non-association between Neo-5 personality tests and multiple mini-interview. Advances in Health Sciences Education, 15, 415–423.
Lee, K., & Ashton, M. C. (2008). The HEXACO personality factors in the indigenous personality lexicons of English and 11 other languages. Journal of Personality, 76, 1001–1053.
Lemay, J. F., Lockyer, J., Collin, V. T., & Brownell, K. W. (2007). Assessment of non-cognitive traits through the admissions multiple mini-interview. Medical Education, 41, 573–579.
Lievens, F., Ones, D., & Dilchert, S. (2009). Personality scale validities increase throughout medical school. Journal of Applied Psychology, 94, 1514–1535.
Messick, S. (1989). Validity. In R. L. Linn (Ed.), Educational measurement (3rd ed., pp. 13–103). New York: Macmillan.
Reiter, H. I., Eva, K. W., Rosenfeld, J., & Norman, G. R. (2007). Multiple mini-interviews predict clerkship and licensing examination performance. Medical Education, 41, 378–384.
Roberts, C., Zoanetti, N., & Rothnie, I. (2009). Validating a multiple mini-interview question bank assessing entry-level reasoning skills in candidates for graduate-entry medicine and dentistry programmes. Medical Education, 43, 350–359.
Rothstein, M. G., & Goffin, R. (2006). The use of personality measures in personnel selection: What does current research support? Human Resource Management Review, 16, 155–180.
Schmitt, N., Keeney, J., Oswald, F. L., Pleskac, T. J., Billington, A. Q., Sinha, R., et al. (2009). Prediction of 4-year college student performance using cognitive and noncognitive predictors and the impact on demographic status of admitted students. Journal of Applied Psychology, 94, 1479–1497.
Silverman, J. D., Kurtz, S. M., & Draper, J. (2005). Skills for communicating with patients (2nd ed.). Oxford: Radcliffe Publishing.
Tett, R. P., Jackson, D. N., & Rothstein, M. G. (1991). Personality measures as predictors of job performance: A meta-analytic review. Personnel Psychology, 44, 703–742.
Thorndike, R. L. (1949). Personnel selection: Test and measurement techniques. New York: Wiley.
