School Psychology Quarterly, 2015, Vol. 30, No. 2, 244-259
© 2014 American Psychological Association 1045-3830/15/$12.00 http://dx.doi.org/10.1037/spq0000083

Sensitivity to Change and Concurrent Validity of Direct Behavior Ratings for Academic Anxiety

Nathaniel P. von der Embse, Emma-Catherine Scott, and Stephen P. Kilgus
East Carolina University

Multitiered frameworks of service delivery have traditionally underserved students with mental health needs. Whereas research has supported the assessment and intervention of social and academic behavior across tiers, evidence is limited with regard to mental health concerns, including internalizing behaviors (e.g., anxiety and depression). In particular, there is a notable shortage of brief anxiety assessment tools to be used for progress monitoring purposes. Moreover, traditional omnibus rating scale approaches may fail to capture contextually dependent anxiety. The purpose of the present investigation is to examine the sensitivity to change and concurrent validity of Direct Behavior Ratings (DBR; Chafouleas, Riley-Tillman, & Christ, 2009; Chafouleas, Riley-Tillman, & Sugai, 2007) of anxiety and traditional rating scales in measuring academic anxiety directly before, during, and after a potentially anxiety-provoking stimulus. Research was conducted with 115 undergraduate students at a Southeastern university. Results indicated significant relationships between DBRs and pre- and postmeasures of anxiety. Change metrics suggested an overall lack of correspondence between DBR and the criterion measure, with DBR scales detecting greater change both across the testing situation and across participants. The use of DBR for anxiety is considered within a multitiered, problem-solving framework. Feasibility and limitations associated with implementation are discussed.

Keywords: anxiety, self-monitoring, brief assessment, direct behavior rating

This article was published Online First October 6, 2014. Nathaniel P. von der Embse, Emma-Catherine Scott, and Stephen P. Kilgus, Department of Psychology, East Carolina University. Nathaniel P. von der Embse is now at Temple University. Emma-Catherine Scott is now at East Carolina University. Stephen P. Kilgus is now at the University of Missouri. Correspondence concerning this article should be addressed to Nathaniel P. von der Embse, NCSP, 1301 Cecil B. Moore Ave., Ritter Hall/Ritter Annex (003-00), Temple University, Philadelphia, PA 19122-6091. E-mail: nate.v@temple.edu

In recent years, school-based service delivery has moved from a model characterized as "refer-test-place" toward a comprehensive, multitiered, prevention-focused model founded on problem-solving logic and data-based decision making. Research has supported such a model in the domains of academics (e.g., response to intervention; RtI) and social behavior (e.g., positive behavior and intervention supports; PBIS) (Riley-Tillman, Burns, & Gibbons, 2013). In contrast, there is limited research to support school-based assessment and intervention of mental health concerns, thus reducing the potential adoption of multitiered frameworks within this domain (McIntosh, Ty, & Miller, 2013). Competing administrative commitments and limited school resources have likely contributed to the underidentification of and intervention for mental health concerns (Davis, Kruczek, & McIntosh, 2006). Federal legislation such as the Individuals with Disabilities Education Improvement Act (IDEA; 2004) and the ever-increasing emphasis on high-stakes testing have driven educational decision makers to focus efforts on identifying and intervening with academic and social behavior problems (i.e., externalizing behaviors) that disrupt classroom activities and lower test performance (Huang et al., 2005; Tolan & Dodge, 2005). Meanwhile, students with mental health concerns, including those with disturbances of emotion or mood, may not receive the services necessary


to be successful in school. The purpose of the present investigation was to develop the novel assessment solutions required to support identification of and intervention for mental health concerns within a multitiered framework.

Prevalence of Mental Health Concerns in Schools

It is estimated that 20% of students have a diagnosable mental health disorder (Merikangas et al., 2010), yet only 20% of those exhibiting signs of mental health problems ever receive services (Hoagwood & Johnson, 2003). Students with externalizing behavior problems (e.g., hyperactive, aggressive, disruptive behavior) are more easily identified by educators (McIntosh et al., 2013). Students with internalizing problems, characterized by inhibited behavior and maladaptive overregulation of emotional states (Merrell, 2008), are less likely to be referred for services because they are less likely to disrupt instruction and their symptoms are difficult to observe (Gresham & Kern, 2004; Kalberg, Lane, Driscoll, & Wehby, 2011). Internalizing problems include anxiety, depression, social withdrawal, and somatic complaints (Merrell & Gueldner, 2010). Although both externalizing (e.g., oppositional defiant disorder) and internalizing (e.g., depression) disorders fall within the broad purview of "mental health," the focus of the present investigation is on disturbances of mood and emotion (e.g., anxiety).

Anxiety is the most common mental health problem in children, with prevalence ranging from 3% to 41% in the preadolescent population (Cartwright-Hatton, McNicol, & Doubleday, 2006). Symptoms can include physiological arousal (e.g., perspiration), somatic complaints (e.g., stomach aches), avoidance behaviors (e.g., social withdrawal), and maladaptive cognitions (e.g., excessive fear, worry). Within school settings, common forms of anxiety include specific phobias, school refusal, social anxiety, separation anxiety, and test anxiety (Merrell, 2008).
Anxiety is generally classified as either (a) trait anxiety, a stable characteristic of high levels of emotional arousal across many situations, or (b) state anxiety, which is often temporary and situationally specific (Spielberger & Vagg, 1995). Academic anxiety is a situation-specific form of anxiety consisting of cognitive, physiological, and behavioral responses directly related to educational contexts (e.g., classroom instruction, test completion; Cassady, 2010). This phenomenon may accompany concern about the potential negative consequences of failure on a test (e.g., test anxiety; Zeidner, 1998) or in specific academic subjects (e.g., math anxiety, foreign language anxiety; Hembree, 1990). High levels of academic anxiety can interfere with concentration and memory on a given task, preventing successful performance. Persistent academic anxiety may result in school refusal, dropping out, and reduced career opportunities (Donovan & Spence, 2000; Rapee, Kennedy, Ingram, Edwards, & Sweeney, 2005).

Anxiety toward testing is associated with impaired academic performance, lower course grades, decreased motivation, and increased stress (Hembree, 1988; von der Embse & Hasson, 2012; von der Embse, Mata, Segool, & Scott, 2014). Researchers have proposed a biopsychosocial theory of test anxiety with three distinct components: biological (i.e., within-child factors, physiological manifestations), psychological (e.g., cognitive perceptions), and social (e.g., parent/teacher expectations; Lowe et al., 2008). Each component may influence test performance in a different manner; for example, anxiety arising from social expectations may improve performance (von der Embse & Witmer, 2014). Research suggests that test anxiety has increased since the implementation of high-stakes testing policies (Putwain, 2007; Wren & Benson, 2004) and may negatively affect up to 40% of students (Cizek & Burg, 2006; Gregor, 2005). Despite these negative effects, academic and test anxiety are not formally recognized disorders within the Diagnostic and Statistical Manual of Mental Disorders (5th ed.; DSM-5; American Psychiatric Association, 2013).
Educators are now tasked with preparing their students to perform to the best of their abilities in high-stress, high-stakes testing situations. However, schools are largely remiss in identifying students with academic anxiety and in teaching children essential self-regulatory skills (Mayer, Roberts, & Barsade, 2008). Multitiered, prevention-focused service delivery models hold promise in identifying students with internalizing behaviors, including anxiety specific to academic contexts.

Tiered Models of Mental Health Service Delivery

Initial evidence suggests that school-wide PBIS (SWPBIS) may indirectly improve internalizing problems. For instance, universal antecedent-based strategies common within SWPBIS may indirectly reduce anxiety by removing threatening stimuli (e.g., bullying, disruption of classroom instruction; McIntosh et al., 2013). Furthermore, SWPBIS's focus on teaching and reinforcing adaptive responses may promote effortful control and the use of appropriate replacement behaviors within challenging situations, including tests (Bradshaw, Waasdorp, & Leaf, 2012; McIntosh et al., 2013). At the Tier II level, schools may implement curricula that emphasize self-awareness (Akin-Little, Little, Bray, & Kehle, 2009) and peer modeling (Miller, Shumka, & Baker, 2012). However, there is a notable lack of the brief progress monitoring assessments needed to evaluate the effectiveness of interventions for internalizing behaviors within a multitiered framework.

It is unfortunate that research has yet to demonstrate the relevance of modern, multitiered models of service delivery for internalizing behavior problems. If these frameworks are to gain widespread adoption and use within schools, research is needed to develop innovative assessment and intervention solutions. Several researchers have proposed multitiered intervention frameworks for anxiety (Sulkowski, Joyce, & Storch, 2012) and interventions for depression that can be employed across tiers of service (Carnevale, 2013). Within the assessment domain, research has begun to amass with regard to universal screening for mental health concerns, including internalizing behaviors.
There are several existing universal screeners, including the Student Internalizing Behavior Screener (SIBS; Cook et al., 2011), the Strengths and Difficulties Questionnaire (SDQ; Goodman, 1997), and the Behavioral and Emotional Screening System (BESS; Kamphaus & Reynolds, 2007). However, more work is needed in the area of progress monitoring for interventions targeting internalizing behaviors such as anxiety and depression.

Progress Monitoring of Mental Health Interventions

Appropriate and change-sensitive assessments for measuring response to intervention may influence the success or failure of a multitiered service delivery system (Chafouleas, Sanetti, Kilgus, & Maggin, 2012; Gresham et al., 2010). Problem solving in Tiers II and III requires contextually specific data collection to inform intervention, monitor response to intervention, and evaluate intervention outcomes (Christ, Riley-Tillman, & Chafouleas, 2009). Therefore, progress monitoring assessment procedures should be technically adequate, sensitive to small changes in behavior, and efficient (Gresham et al., 2010). As educational decision-making practices shift from summative to formative assessment, efficient and contextually specific assessment procedures have become increasingly valuable. Previous mental health intervention research has employed various progress monitoring procedures, including (a) behavior rating scales and (b) Direct Behavior Rating-Multiple Item Scales (DBR-MIS; Christ et al., 2009). The extent to which each is appropriate for progress monitoring may be evaluated relative to the criteria established by Christ et al. (2009): progress monitoring instruments should be defensible, repeatable, flexible, and efficient.

Though many possess strong psychometric evidence, traditional rating scales are not intended to be used frequently or to measure small changes in behavior (Chafouleas, Riley-Tillman, & Christ, 2009). Raters retrospectively evaluate behavior at a time distant from the time and place the target behavior occurred. Traditional rating scales may not accurately capture state anxiety in a specific setting (e.g., tests, math class), limiting both an understanding of its influence on performance and an evaluation of responsiveness to intervention (von der Embse, Kilgus, Segool, & Putwain, 2013). In addition, the time and cost required of traditional rating scales may limit their usability for progress monitoring purposes with large numbers of students (Lane, Bruhn, Eisner,


& Kalberg, 2010). Taken together, these shortcomings may hinder the validity of traditional rating scales for academic anxiety.

Recent research has provided psychometric evidence supporting the use of change-sensitive brief behavior rating scales (BBRS) as general outcome measures of social and externalizing behaviors (Chafouleas, Kilgus, Riley-Tillman, Jaffery, & Harrison, 2012; Gresham et al., 2010) and ADHD symptoms (Volpe, Gadow, Blom-Hoffman, & Feinberg, 2009). BBRS are behavior rating scales that have been adapted and modified for progress monitoring purposes and include a subset of items that (a) are change sensitive and (b) measure specific behaviors (Gresham et al., 2010). These tools allow educators to consider construct change rather than assume that a static time point will be reliably predictive across time (Herman, Riley-Tillman, & Reinke, 2012; Volpe & Briesch, 2012). BBRS have been employed within intervention studies targeting internalizing behaviors (Hunter, Chenier, & Gresham, 2014). Likewise, the Subjective Units of Distress Scale (SUDS) is a common method used within clinical settings to measure anxiety in response to exposure treatments; SUDS is typically paired with a "feelings thermometer" as a visual analogue for use with young children (Kendall et al., 2005). However, these measures (a) generally lack the psychometric support necessary for widespread adoption specific to progress monitoring purposes or (b) have not been validated within clinical settings (Benjamin et al., 2010).

DBR (Chafouleas, Riley-Tillman, & Christ, 2009; Chafouleas, Riley-Tillman, & Sugai, 2007) is one such brief assessment tool for use in formative behavioral assessment. DBR procedures and instrumentation typically include a brief rating of a target behavior during a prespecified time using a gradient scale (e.g., a teacher may rate student disruptive behavior on a 0-10 scale during afternoon independent seat work). DBR single-item scales (DBR-SIS) have moderate to high concordance with teacher rating scales (Chafouleas, Riley-Tillman, & Christ, 2009) and systematic direct observation (Riley-Tillman, Chafouleas, Sassu, Chanese, & Glazer, 2008) for disruptive behaviors, and are sensitive to intervention effects (Chafouleas, Sanetti, et al., 2012). A review of the literature suggests promise for DBR methods, as the National Center for Intensive Intervention has identified DBR-SIS as possessing convincing evidence with regard to social and academic behaviors (www.intensiveintervention.org/); conversely, there is little research utilizing DBRs with internalizing behaviors. Previous DBR-related research has relied on teacher and parent report (Chafouleas, Christ, & Riley-Tillman, 2009; Riley-Tillman et al., 2008). However, self-report may be a more accurate measure of anxiety and related symptomology (Merrell, 2008). Thus, research is needed to examine the psychometric utility of self-report DBR assessments for internalizing behaviors.

Purpose of the Present Study

There is a shortage of research evaluating brief measures of internalizing behavior for the purpose of formative assessment within a multitiered, problem-solving framework. Direct observations can provide repeatable and robust data, but they are time consuming and often unrealistic to gather for multiple students (Briesch, Chafouleas, & Riley-Tillman, 2010). Given the challenges associated with observational ratings of internalizing behaviors, it is increasingly difficult for educators to progress monitor Tier II and group-level interventions. Thus, there is a distinct need for efficient assessments that evaluate situation-specific internalizing behavior, particularly academic anxiety. The DBR-SIS for anxiety holds potential for the evaluation of interventions, as it can be repeatedly administered in immediate proximity to the anxiety-provoking stimulus. DBR-SISs are more efficient, more appropriate for monitoring intervention effects, and more likely to be administered by educators than larger multiple-item indicators (Chafouleas, Sanetti, Kilgus, & Maggin, 2012). However, there is limited research evaluating DBR-SIS against the composite score of a rating scale (Volpe & Briesch, 2012). The purpose of this study was to evaluate the sensitivity to change and concurrent validity of DBR-SIS for anxiety before, during, and after a potentially anxiety-provoking stimulus (i.e., a test). The following hypotheses were generated: (a) DBR-SIS would exhibit moderate to high correlations with pre- and postmeasures of anxiety, and (b) the degree of change in DBR-SIS during a testing situation would reflect the change in pre- and postmeasures of anxiety.

Method

Participants

College students were selected for the present study given their status as consenting adults, the relatively minimal risk of measuring state levels of anxiety during a testing situation, and the potential to evaluate assessment utility with an analogue to a school-aged population. This population more readily allows for an examination of DBR-SIS of anxiety, particularly within a testing situation, to support future research with school-age children while minimizing risk. Similar participant selection was noted in studies examining the technical adequacy of direct behavior ratings prior to evaluations with minors (e.g., Chafouleas, Christ, & Riley-Tillman, 2009; Chafouleas, Kilgus, et al., 2012).

Participants were selected from a larger intervention study (N = 170) examining the concordance of physiological measures of anxiety (e.g., heart rate variability; HRV) and self-monitoring (SM) of anxiety as predictors of test performance. Participants were randomly assigned to one of three treatment groups. The first group (HRV + SM) included 57 participants who wore heart rate monitors and completed DBR-SIS throughout a 60-min testing session. The second (n = 58; SM only) and third (n = 55; HRV only) groups followed the same procedures, exclusively self-monitoring or wearing a heart rate monitor, respectively. A total of 115 undergraduate students (from the SM and HRV + SM groups) were included in the present investigation. Participants were recruited from introductory psychology classes, in which they received research credit toward their final course grade. To be included, participants needed to be at least 18 years of age (M = 19.0 years, SD = 1.30 years). Students who had previously taken the GRE were excluded from the study. The sample consisted of 66 female and 49 male college students. The majority of the sample was White (n = 66, 57.4%); the sample also included 29 students who were African American (25.2%), six who were Latino (5.2%), seven who were Asian (6.1%), and seven who were of mixed ethnicity (6.1%). Self-reported cumulative grade point averages (on a 4-point scale) ranged from 1.2 to 4.0 (M = 2.95, SD = 0.59).

Measures

Participants completed a demographic questionnaire, a pre- and post-Test Anxiety Inventory (TAI; Spielberger, 1980), a modified version of the GRE, and a DBR-SIS of anxiety.

TAI. Pre- and postanxiety were measured via the TAI (Spielberger, 1980). The TAI is a widely used self-report measure of test anxiety that employs a 4-point Likert scale ranging from 1 (almost never) to 4 (almost always). There are 20 items in total, and scores on the TAI range from 20 to 80 points. The TAI has two subscales: Worry, which measures cognitive concerns about the consequences of failure (e.g., "I feel very nervous during the test"), and Emotionality, which measures physiological responses associated with test anxiety (e.g., "I feel jittery when taking an important test"). Separate norms, including means and standard deviations, are provided for male and female undergraduates. Cronbach's alpha for the total score on the TAI was .92, and test-retest reliability was r = .80 (Spielberger, 1980). The TAI exhibits acceptable convergent validity with measures of state anxiety and academic performance (Szafranski, Barrera, & Norton, 2012) and is sensitive to change following intervention (Brown et al., 2011).

DBR-SIS. The DBR-SIS consisted of three Likert-scale items ranging from 1 to 10, with 1 indicating no anxiety and 10 indicating very high anxiety (see Appendix). Research has supported the use of DBR-SISs as an efficient method for external ratings of global behaviors (Christ, Riley-Tillman, & Chafouleas, 2009). However, others have suggested that multiple-item scales may accelerate decision making and enable intervention specific to a pattern of behaviors (e.g., anxiety; Volpe & Briesch, 2012). Therefore, the performance of individual items was compared in the present study. DBR-SIS items reflected a three-factor, biopsychosocial model of test anxiety (Friedman & Bendas-Jacobs, 1997; Lowe et al., 2008; Segool, von der Embse, Mata, & Gallant, 2014); items were written to reflect social (DBR-S), cognitive (DBR-C), and physiological (DBR-P) aspects of test anxiety. DBR-SIS items were compared individually, rather than as a sum or total score (e.g., a DBR-MIS approach), to identify the most salient item in measuring anxiety. Typically, DBR-SIS ratings use a graphic scale of 0 to 10, whereas DBR-MIS ratings utilize a 1-to-3 or 1-to-5 Likert scale.

Content validation procedures were conducted by (a) reviewing the relevant internalizing behavior literature and existing assessments, (b) developing an initial pool of 21 potential items, and (c) contacting six national experts in behavioral and psychological assessment design (Haynes, Richard, & Kubany, 1995). Experts rated items on a scale of 1 to 5, with 1 indicating no relevance/readability and 5 indicating the most relevance/readability relative to the proposed test anxiety theory (biopsychosocial theory; Lowe et al., 2008). Based on this feedback, the pool of 21 potential items was reduced to the three items included on the DBR-SIS.

Modified GRE General Test. The GRE Revised General Test is a standardized test of academic performance taken by prospective graduate school applicants; it measures three academic areas: verbal reasoning, quantitative reasoning, and analytical writing. The verbal and quantitative reasoning sections were selected for the current investigation; the analytical writing section was not included. The verbal reasoning portion of the GRE measures the ability to evaluate written material and analyze relationships among concepts. The quantitative reasoning section measures problem-solving ability, focusing on basic concepts including algebra, geometry, and data analysis.

Procedures

The university Institutional Review Board approved all study procedures. Each participant was assigned a unique code to protect confidentiality.
Participants were informed of the opportunity to obtain a $300 gift card to a national retail store based on overall test performance. For each correct test item, the odds of obtaining the gift card increased (i.e., participants with the highest test scores had the most chances to win the gift card). The incentive was used to emphasize the importance of test performance relative to a desired outcome, as an analogue of typical high-stakes testing in school settings. It was presumed that this incentive would translate to increased baseline anxiety for at least a subset of individuals, thus permitting (a) an opportunity for SM to act on elevated anxiety and (b) an evaluation of DBR-SIS sensitivity to change in anxiety from baseline levels in response to SM. As part of the larger intervention study, participant test performance was compared across three conditions (HRV, SM, HRV + SM).

Study personnel distributed consent forms, demographic forms, and the first TAI. Each participant then completed a baseline rating of the DBR-SIS. Next, participants were informed that they had 30 min to complete each section of the GRE (i.e., verbal and quantitative). After 30 min, participants were reminded of the importance of test performance for obtaining the incentive and were instructed to complete the next GRE section. GRE verbal and quantitative sections were presented in random order to prevent order effects. Study personnel instructed participants to complete the DBR-SIS at regular 10-min intervals throughout the 60-min testing session, for a total of seven DBR-SIS ratings (i.e., the baseline DBR-SIS plus six DBR-SIS ratings at 10-min intervals). On completion of the GRE, a final TAI was administered.

Data Analysis

Pearson product-moment correlation coefficients (r) were computed to examine both the concurrent criterion-related validity and the sensitivity to change of the DBR-SIS. To permit these analyses, it was necessary to first organize and aggregate the TAI and DBR-SIS data, as described below.

Concurrent validity. Pretest and posttest TAI ratings were each considered as part of the concurrent validity analyses.
Also included in the analyses were DBR data collected at pretest (i.e., Time 1), DBR data collected at the final time point (i.e., Time 7, during testing), and the arithmetic mean of all DBR data collected during testing (= [DBR2 + DBR3 + DBR4 + DBR5 + DBR6 + DBR7]/6). Pretest TAI and pretest DBR data were compared via correlations to examine the concurrent validity of the DBR scales as indicators of academic anxiety prior to a testing situation. Posttest TAI ratings were compared to both final DBR and mean DBR data via correlations to evaluate the concurrent validity of the DBR scales as indicators of academic anxiety subsequent to a testing situation. The decision to consider both final DBR and mean DBR reflected the intention to determine whether posttest academic anxiety, as measured by the TAI, was better predicted by (a) self-rated anxiety throughout the testing situation or (b) self-rated anxiety immediately following the testing situation. Concurrent validity analyses included only participants in the HRV + SM and SM groups, as it was only of interest to consider students who had participated in the intervention conditions of the larger intervention study.

Sensitivity to change. Next, in accordance with Chafouleas, Sanetti, and colleagues' (2012) procedures, change metrics were computed and compared to evaluate DBR sensitivity to change. TAI items were summed (possible range: 20 to 80 points) and compared to the DBR-SIS (possible range: 1 to 10 points). Two change metrics were computed for both TAI and DBR data: absolute change scores and percentage change scores. TAI absolute change scores were computed for each student by subtracting his or her pretest TAI score from his or her posttest TAI score (= TAIpost - TAIpre). DBR absolute change scores were computed by subtracting a student's pretest DBR score from his or her DBR mean score (as described earlier; = DBRmean - DBRpre). TAI percentage change scores were then computed by dividing each student's TAI absolute change score by his or her pretest TAI score (= [TAIpost - TAIpre]/TAIpre). DBR percentage change scores were computed by dividing a student's absolute DBR change score by his or her pretest DBR score (= [DBRmean - DBRpre]/DBRpre).
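As a concrete illustration, the aggregation and change-score formulas above can be sketched in a few lines of Python. This is an illustrative sketch only; the function name and the participant values below are hypothetical and are not drawn from the study data.

```python
def change_scores(pre, comparison):
    """Compute absolute and percentage change for one participant.

    pre        : baseline score (pretest TAI total, or the Time 1 DBR rating)
    comparison : list of scores forming the comparison point -- the posttest
                 TAI total as a one-element list, or the six DBR ratings
                 collected during testing, which are averaged into DBRmean.
    """
    mean_score = sum(comparison) / len(comparison)  # DBRmean = (DBR2 + ... + DBR7)/6
    absolute = mean_score - pre                     # e.g., TAIpost - TAIpre
    percentage = absolute / pre                     # e.g., (TAIpost - TAIpre)/TAIpre
    return absolute, percentage

# Hypothetical participant: TAI total rises from 40 to 46; six DBR ratings
# during testing average 3.0 against a baseline (Time 1) rating of 2.0.
tai_abs, tai_pct = change_scores(40, [46])
dbr_abs, dbr_pct = change_scores(2.0, [3, 2, 4, 3, 2, 4])

print(tai_abs, tai_pct)  # 6.0 0.15
print(dbr_abs, dbr_pct)  # 1.0 0.5
```

Because the posttest TAI is a single score, passing it as a one-element list lets the same function serve both instruments; for the DBR, the comparison point is the mean of the six during-test ratings, per the formulas above.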
It is noted that the decision to calculate change scores using DBR mean scores, as opposed to DBR scores collected at the final time point (Time 7), was made in recognition of (a) previous DBR sensitivity to change studies, which have also considered mean DBR scores (Chafouleas, Sanetti, et al., 2012), and (b) correlational findings indicating a high correlation between mean and final DBR scores, and thus their potential interchangeability.

The decision to consider multiple change metrics was in the interest of determining the extent to which the current sensitivity to change findings were related to the specific statistic chosen. Differing results across statistics might suggest limited statistical conclusion validity, whereas consistent results would support the robustness of the findings and the validity of claims regarding DBR sensitivity to change. As with the concurrent validity analyses, sensitivity to change analyses included only those students in the HRV + SM and SM groups, within which changes in academic anxiety were expected in response to intervention.

To adjust for the inflated familywise Type I error rate resulting from the calculation of numerous correlation coefficients, Bonferroni corrections were applied to all pairwise critical p values. Because a total of 57 unique, nonredundant correlations were calculated across all validity and sensitivity to change analyses, the critical p value for each pairwise comparison was adjusted to .001 (.05/57 ≈ .001).

It was also of interest to determine the extent to which correlational findings differed across the intervention conditions (i.e., HRV, SM, and HRV + SM) described in the Procedures section. As such, all analyses were conducted both across all groups at the aggregate level and within each individual group (with the exception of sensitivity to change analyses, which were conducted only in the SM and HRV + SM groups). Overall, the patterns of findings were nearly identical across all analyses, leading to consistent conclusions regarding DBR-SIS concurrent validity and sensitivity to change. As such, in the interest of parsimony, we report only general findings at the aggregate level across all groups.

Results

Descriptive Statistics

See Table 1 for an overview of descriptive statistics.
Overall, the majority of variables demonstrated a relatively normal distribution (i.e., skewness and kurtosis < ±2). This is with the noted exception of the absolute and percentage change metrics, several of which were positively skewed and leptokurtic. These findings suggest one should use caution when interpreting the current results, as correlational estimates may be somewhat biased in the presence of nonnormal data. Variable transformation was considered to eliminate such bias, but was ultimately rejected given (a) the relatively limited number of nonnormal variables and (b) the desire to preserve the interpretability of findings. Use of nonparametric correlation coefficients (i.e., Spearman's ρ) also was considered, but was rejected given the first of the reasons noted.

Table 1
Descriptive Statistics for the Test Anxiety Inventory and Direct Behavior Ratings of Anxiety

Variable             M      SD     Min     Max  Skewness  Kurtosis
Pretest TAI
  TAI-W           16.44    5.44    8.00   29.00    0.69     -0.45
  TAI-E           18.20    5.61    9.00   31.00    0.39     -0.62
  TAI-T           43.35   12.57   23.00   72.00    0.44     -0.66
Posttest TAI
  TAI-W           16.70    5.48    8.00   31.00    0.59     -0.19
  TAI-E           17.94    5.23    8.00   32.00    0.26     -0.37
  TAI-T           43.35   12.00   23.00   77.00    0.42     -0.27
Pretest DBR
  DBR-C            3.05    1.86    1.00    7.00    0.65     -0.61
  DBR-S            2.13    1.69    1.00    8.00    1.66      2.08
  DBR-P            2.76    1.92    1.00   10.00    1.12      0.99
Final DBR
  DBR-C            2.36    1.83    1.00    8.00    1.32      0.89
  DBR-S            2.22    1.82    1.00    8.00    1.46      1.19
  DBR-P            3.57    2.53    1.00   10.00    0.92      0.06
M DBR
  DBR-C            2.68    1.87    1.00    7.83    0.98     -0.13
  DBR-S            2.23    1.65    1.00    7.50    1.21      0.31
  DBR-P            3.54    2.15    1.00    9.83    0.71     -0.23
Absolute change
  TAI-W            0.26    2.39   -6.00    7.00   -0.04      0.48
  TAI-E           -0.26    2.83  -16.00    9.00   -1.09      8.66
  TAI-T            0.00    5.45  -28.00   14.00   -1.10      6.25
  DBR-C           -0.37    1.36   -5.50    3.67    0.02      2.77
  DBR-S            0.10    1.17   -3.50    4.33    0.47      2.52
  DBR-P            0.78    1.47   -3.33    5.00    0.47      0.52
% change
  TAI-W            0.03    0.15   -0.33    0.50    0.44      0.60
  TAI-E            0.00    0.15   -0.55    0.75    1.09      6.09
  TAI-T            0.01    0.12   -0.48    0.50    0.33      4.10
  DBR-C           -0.02    0.63   -0.79    3.67    3.60     17.60
  DBR-S            0.19    0.74   -0.71    4.33    3.31     13.65
  DBR-P            0.49    0.85   -0.56    5.00    2.36      7.77

Note. TAI-W = Test Anxiety Inventory Worry subscale; TAI-E = Test Anxiety Inventory Emotionality subscale; TAI-T = Test Anxiety Inventory Total scale; DBR-C = Direct Behavior Rating-Cognitive type anxiety; DBR-S = Direct Behavior Rating-Social type anxiety; DBR-P = Direct Behavior Rating-Physiological type anxiety.
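The ±2 screen on skewness and excess kurtosis described above can be sketched in a few lines of pure Python. This is an illustration only; the moment formulas below are the standard population versions, which may differ slightly from whatever software the authors used:

```python
def skewness(x):
    """Third standardized moment (population formula)."""
    n = len(x)
    m = sum(x) / n
    m2 = sum((v - m) ** 2 for v in x) / n
    m3 = sum((v - m) ** 3 for v in x) / n
    return m3 / m2 ** 1.5

def excess_kurtosis(x):
    """Fourth standardized moment minus 3 (0 for a normal distribution)."""
    n = len(x)
    m = sum(x) / n
    m2 = sum((v - m) ** 2 for v in x) / n
    m4 = sum((v - m) ** 4 for v in x) / n
    return m4 / m2 ** 2 - 3

def roughly_normal(x, cutoff=2.0):
    # Mirrors the skewness and kurtosis < |2| screen applied to Table 1.
    return abs(skewness(x)) < cutoff and abs(excess_kurtosis(x)) < cutoff

print(roughly_normal([1, 2, 3, 4, 5, 6, 7, 8, 9]))   # True
print(roughly_normal([1, 1, 1, 1, 1, 1, 1, 1, 20]))  # False (skewed, leptokurtic)
```

The second sample fails the screen for the same reason several change metrics did: one extreme value produces strong positive skew and heavy tails.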

Concurrent Validity

As described, Pearson correlation coefficients were calculated in evaluating the concurrent validity of DBR scales. See Table 2 for a summary of findings. Results suggested all correlation coefficients were statistically significant at either the p < .05 or p < .01 level, whereas 20 of the 27 (74%) were significant at the Bonferroni-adjusted p < .001 level. During pretest, correlations fell in the small (>.10) and medium (>.30; Cohen, 1992) ranges, ranging between .22 and .47. Similar findings were noted for comparisons between posttest TAI and both mean DBR and final DBR


Table 2
Pearson Product-Moment Correlations Between DBR-SIS and TAI Scales

                  Pretest TAI
               TAI-W   TAI-E   TAI-T
Pretest DBR-C   .41*    .41*    .47*
Pretest DBR-S   .27     .22     .30*
Pretest DBR-P   .41*    .39*    .45*

                  Posttest TAI
               TAI-W   TAI-E   TAI-T
Final DBR-C     .34*    .40*    .41*
Final DBR-S     .26     .23     .29
Final DBR-P     .24     .30*    .29
M DBR-C         .52*    .54*    .59*
M DBR-S         .38*    .36*    .42*
M DBR-P         .43*    .48*    .49*

Note. Table includes heart rate variability (HRV) + self-monitoring (SM) and SM only groups (n = 115). DBR-SIS = Direct Behavior Rating-Single-Item Scale; TAI = Test Anxiety Inventory; TAI-W = Test Anxiety Inventory Worry subscale; TAI-E = Test Anxiety Inventory Emotionality subscale; TAI-T = Test Anxiety Inventory Total scale; DBR-C = Direct Behavior Rating-Cognitive type anxiety; DBR-S = Direct Behavior Rating-Social type anxiety; DBR-P = Direct Behavior Rating-Physiological type anxiety.
* p < .001.

data, with correlations falling in the small, medium, and large (>.50) ranges; specifically, ranging between .36 and .59 for TAI-mean DBR and .23 and .41 for TAI-final DBR. On average, correlations were higher between the posttest TAI and mean DBR variables (M r = .47) relative to correlations between the posttest TAI and final DBR variables (M r = .31) and pretest TAI and pretest DBR (M r = .37). Furthermore, correlations were higher for the DBR-C and DBR-P scales relative to the DBR-S.
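The machinery behind these comparisons can be sketched in a few lines of pure Python. The scores below are made up for illustration and are not study data; the 57-comparison Bonferroni count is taken from the Method text:

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Bonferroni-adjusted critical p value: familywise alpha divided by the
# number of nonredundant correlations (57 in this study), i.e., .05/57.
critical_p = 0.05 / 57

# Made-up TAI totals and DBR ratings for illustration only
tai = [20, 25, 31, 27, 40, 35]
dbr = [2, 3, 4, 3, 6, 5]
print(round(pearson_r(tai, dbr), 2))  # strong positive correlation
print(round(critical_p, 3))           # 0.001
```

A coefficient would then be compared against `critical_p` rather than the conventional .05 when judging significance across the full family of tests.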

Sensitivity to Change

Correlations were also calculated in comparing change metrics across TAI and DBR scales. See Table 3 for an overview of findings for absolute change metrics and Table 4 for findings regarding percentage change metrics. Correlations were predominantly nonsignificant, with the exception of the correlations between percentage change metrics for DBR-C and TAI-E (r = .22, p < .05) and DBR-C and TAI-T (r = .20, p < .05). Taken together, these nonsignificant results indicated that changes in anxiety, as evaluated via the TAI scales, were generally not consistent with changes evaluated via DBR scales.

Change metric descriptive statistics were further reviewed in an effort to elucidate the potential cause of such noncorrespondence across measures. Findings were suggestive of differences across TAI and DBR in regard to within-person change in response to self-monitoring. Relative to the TAI, DBR percentage change statistics demonstrated notably higher means, standard deviations, and ranges (i.e., Max - Min), particularly for the DBR-S and DBR-P variables. This indicates that DBR scales demonstrated greater change across the testing situation, and that the level of change in DBR scales varied more widely across participants.

Discussion

Modern school-based mental health service delivery is predicated on the use of efficient and defensible assessments and interventions across a problem-solving framework. To this end, research support for the suitability of current multitiered frameworks (e.g., RtI, PBIS) with mental health problems, specifically internalizing behaviors, is equivocal (McIntosh et al., 2013). Research is needed to develop novel intervention and assessment solutions to be employed across universal, targeted, and intensive tiers. Although evidence has accumulated for universal screening procedures (Severson, Walker,

Hope-Doolittle, Kratochwill, & Gresham, 2007), there remains a gap in the literature regarding progress monitoring tools for internalizing behaviors. As noted by Christ et al. (2009), behavioral and socioemotional progress monitoring assessments should be defensible, flexible, repeatable, and efficient. The purpose of the present investigation was to evaluate one such progress monitoring assessment, the DBR-SIS for anxiety, relative to the established criteria.

Overall, results supported the concurrent validity of DBR-C, DBR-S, and DBR-P of anxiety with the TAI. DBR-SISs demonstrated significant, albeit moderate, correspondence with pre- and posttest measures of anxiety. In particular, DBR-C exhibited the strongest relationship with the TAI and the DBR-S exhibited the weakest relationship. Previous research indicated that cognitive type anxiety is the strongest negative predictor of test performance whereas social anxiety may positively influence performance (von der Embse & Witmer, 2014). Pending future research, DBR-C is a promising alternative in the measurement of anxiety directly before and after a testing situation.

Secondary analyses evaluated the sensitivity to change of the DBR-SIS with the TAI. Research by Chafouleas, Sanetti, Kilgus, and Maggin (2012) indicated that DBR-SISs for externalizing behaviors were sensitive to change regardless of the metric used. Results from the present investigation indicated noncorrespondence between change metrics for the DBRs and TAI; the DBRs demonstrated notably greater change throughout the test. DBRs may be a better measure of within-person change in anxiety across the testing situation; however, single-item scales may be more susceptible to random measurement error when compared with multi-item rating scales. Sensitivity to change may allow for (a) a more precise depiction of the varying nature of anxiety relative to test content difficulty and (b) greater frequency and accuracy in evaluating response to intervention. Caution is urged when interpreting change given the potentially reactive nature of self-monitoring on anxiety (Shapiro & Cole, 1999). Taken together, results from this study highlight the potential of DBR-SIS with internalizing behaviors as a flexible, efficient, and repeatable assessment. For example, DBR-SIS may be used to evaluate the effectiveness of Tier II interventions for anxiety (e.g., progressive muscle relaxation, reattribution training) throughout an anxiety-provoking situation. Future research is necessary to evaluate the psychometric defensibility of DBRs with internalizing behaviors using multiple-criterion measures.

Limitations and Future Research

The present investigation is not without limitations. First, although similar in age to high school students, the participants in the current study were drawn from one university in the Southeastern United States. Caution should be exercised when generalizing the results to school-age populations. Future research should incorporate larger and more diverse samples with school-aged participants. Second, a monetary incentive dependent on GRE test performance was used as an analogue to high-stakes tests within schools. It was not known whether the incentive was effective in prompting participants to take the testing situation seriously. Test anxiety may be more closely aligned with fear of failure (e.g., denial of graduation, college admission) rather than obtainment of a reward or incentive; however, concern over social consequences (e.g., parental or educator approval) may improve test performance (von der Embse & Witmer, 2014). The influence of an incentive, rather than punishment, relative to the variability in anxiety throughout the testing session is unknown. Descriptive statistics indicated that approximately 30% of the sample reported "high" levels of anxiety on the TAI prior to completing the GRE. With generally low levels of initial anxiety, there was limited opportunity to evaluate construct change. Future research investigations could screen participants at risk for academic anxiety, providing a greater opportunity to evaluate sensitivity to change specific to a highly anxious sample. In addition, researchers may consider the use of alternative incentives (e.g., increased number of monetary awards, extra credit), presenting test stimuli (e.g., practice test, authentic test), or performance consequences (e.g., public reporting of test scores, competition for best performance). Third, frequent self-reporting of internalizing symptoms may have a reactive effect (Shapiro & Cole, 1999), reducing overall anxiety. Thus, it is difficult to ascertain whether the DBR-SIS effectively measured or contributed to the change in anxiety before, during, and after the testing session. Previous research had used retrospective measurement approaches to capture changes in anxiety during test taking (Schutz, Distefano, Benson, & Davis, 2004). Although sensitive to change following multiweek clinical interventions (Brown et al., 2011), the TAI has not previously been used to measure anxiety both directly before and after an examination. Further, it is not known how completing the TAI before the DBR-SIS influences responses (i.e., completing the TAI may have primed an anxious rating on the DBR). The DBR-SIS can be easily completed directly by students in most school settings and therefore may be more sensitive to change when compared to traditional rating scales; however, the greater sensitivity to change exhibited by the DBR may be attributed to the unreliability of single-item measurement. Research is needed to further validate DBR assessments using self-report methodology, particularly with a school-aged population. In addition, future research should consider multiple criterion-related measures and metrics to further evaluate sensitivity to change (Chafouleas, Sanetti, et al., 2012).

As mental health assessment evolves, researchers should continue to assess analytical options for progress monitoring development. Emerging research has suggested an evaluation of base rates of mental health problems (e.g., anxiety) to reach the most efficient and accurate intervention decisions (VanDerHeyden, 2013).
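For concreteness, the two change metrics compared in the sensitivity to change analyses can be sketched as follows. The values are made up, and the posttest-minus-pretest direction is an assumption, as the article does not spell out the formulas:

```python
def absolute_change(pre, post):
    """Raw pre-to-post difference (assumed posttest minus pretest)."""
    return post - pre

def percentage_change(pre, post):
    """Change expressed as a proportion of the pretest score."""
    return (post - pre) / pre

# Made-up pretest and posttest anxiety ratings for one participant
pre, post = 4.0, 5.0
print(absolute_change(pre, post))    # 1.0
print(percentage_change(pre, post))  # 0.25
```

Note that the percentage metric scales the same raw difference by the starting level, which is one reason the two metrics can rank participants differently.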


The expected degree of change (i.e., sensitivity to change) may be dependent on initial levels and base rates of the target behavior. For example, omnibus rating scale methods may not be effective in capturing intervention effectiveness for unstable (i.e., state anxiety) or low base rate mental health concerns. These measurement concerns underscore the importance of developing progress monitoring tools that are both flexible and efficient. Although useful in formative assessment frameworks, DBR-SIS needs additional research to support psychometric defensibility due to the complexities of contextual variability and comorbidity inherent with internalizing behaviors. In addition, future research will be necessary to delineate recommendations for use of the DBR-SIS across various assessment purposes.

Implications for Practice

DBR-SIS of Anxiety Within a Problem-Solving Framework

Although historically considered beyond the school's purview (Chafouleas, Kilgus, & Wallach, 2010), mental health service delivery is receiving increased attention within the research and practice domains. Therefore, research is needed to evaluate the usability of assessments across all three tiers, and for the purposes of screening, progress monitoring, and program evaluation. The availability of brief and psychometrically defensible measures that allow for intervention evaluation is paramount when determining student progress and responsiveness in tiered models of service delivery (Chafouleas et al., 2007). Within targeted intervention settings, decisions must be made both for individual students and across groups of students (Anderson, Turtura, & Parry, 2013). Educators must decide whether a student would benefit from a Tier II intervention and which intervention is most appropriate for that student. Successful Tier II interventions rely on cost-efficient assessments that are sensitive to repeated measurements and responsive to intervention (Herman et al., 2012).

The DBR-SISs have demonstrated promise as a brief assessment of social and externalizing behaviors (Chafouleas, Christ, & Riley-Tillman, 2009; Chafouleas, Kilgus, et al.,


2012). Results from the present investigation provide preliminary evidence of the repeatability, flexibility, and efficiency of the DBR-SIS for internalizing behaviors. To be used with internalizing behaviors, assessments need to be sensitive to the context in which anxiety may be exhibited. For example, SUDS level assessments have been used within clinical settings to determine responsiveness to anxiety-targeted interventions, such as cognitive-behavioral therapy (Benjamin et al., 2010). In that regard, the DBR-SIS can be integrated within existing multitiered frameworks at the Tier II level for group intervention evaluation. DBR-SIS data may be used to examine responsiveness to a Tier II intervention for students with academic anxiety. Research will be necessary to determine the required number of ratings to obtain both a reliable estimate of behavior and evaluate responsiveness to intervention (Chafouleas, Christ, & Riley-Tillman, 2009).

DBR-SIS for internalizing behaviors can be used as both an assessment and a behavioral intervention. As noted earlier, the very act of repeated self-reporting (i.e., self-monitoring) may reduce anxiety (Shapiro & Cole, 1999). Self-monitoring has been effective for reducing numerous problem behaviors, including academic skill problems, off-task, and disruptive behavior with children (Briesch & Chafouleas, 2009; Sheffield & Waller, 2010), as well as internalizing symptoms of depression and anxiety in adults (Febbraro & Clum, 1998). Self-monitoring includes the observation and recording of one's behavior (Mace, Belfiore, & Hutchinson, 2001) and promotes maintenance by teaching skills such as independence and self-reliance (Briesch & Chafouleas, 2009; Shapiro & Cole, 1994). Regarding internalizing behaviors, SM may be sufficient to promote behavior change through negative reinforcement (i.e., removal of aversive arousal) and recognition of anxiety-provoking triggers.
Students in targeted intervention settings may be taught to monitor their own anxiety using a DBR-SIS, reducing reliance on support staff. In addition, SM can be implemented in combination with more intensive interventions (e.g., systematic desensitization, exposure-based techniques) for more severe cases of anxiety.
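As an illustration of how such self-monitoring data might be recorded and summarized, here is a minimal sketch. The function and scale names are hypothetical, not from the study; the 1-10 range mirrors the Appendix items, and averaging repeated ratings mirrors the mean DBR scores reported in Table 1:

```python
from statistics import mean

# Hypothetical self-monitoring log for one student: repeated 1-10 DBR-SIS
# ratings on the cognitive, social, and physiological anxiety scales.
ratings = {"DBR-C": [], "DBR-S": [], "DBR-P": []}

def record(scale, value):
    """Record one self-rating; the 1-10 range mirrors the Appendix items."""
    if not 1 <= value <= 10:
        raise ValueError("DBR-SIS ratings fall on a 1-10 scale")
    ratings[scale].append(value)

def mean_dbr(scale):
    """Mean of repeated ratings, analogous to the mean DBR scores in Table 1."""
    return mean(ratings[scale])

# Three self-ratings collected across a testing session
record("DBR-C", 4)
record("DBR-C", 3)
record("DBR-C", 2)
print(mean_dbr("DBR-C"))  # 3
```

A support-staff review could then compare each scale's mean against an intervention goal, or plot the raw log to see how anxiety varied within the session.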


Feasibility

Given the pervasiveness of academic anxiety, it is important to identify measures that are efficient, easily implemented for a large number of students, and sensitive to intervention effectiveness (von der Embse, Barterian, & Segool, 2013). Assessment within Tier II service delivery typically requires fewer resources (e.g., time, effort, cost) than does intensive, individualized assessment at Tier III. DBR-SIS can provide contextually specific data to define the problem, monitor intervention effectiveness, and guide eligibility and diagnostic decisions (Christ et al., 2009). Traditional rating scale assessment may not be contextually appropriate for internalizing concerns that are situationally specific. As such, research is needed to examine the number of ratings and DBR items necessary to reach an efficient and reliable decision when compared with omnibus rating scales or DBR multi-item scales (Volpe & Briesch, 2012). Additional investigations could evaluate the efficiency of a DBR approach by comparing the reliability of a single rating versus multiple ratings of anxiety across time (days or weeks). Educators should consider assessments of internalizing behaviors that are (a) sensitive to the presenting stimuli, (b) repeatable across time, (c) psychometrically sound, and (d) efficient. The DBR-SIS requires fairly minimal time for training and rating of relevant internalizing behaviors. Finally, DBR-SISs of anxiety are a cost-effective alternative to time- and cost-intensive omnibus rating scales.

References

Akin-Little, K. A., Little, S. G., Bray, M. A., & Kehle, T. J. (2009). Behavioral interventions in schools: Evidence-based positive strategies. Washington, DC: American Psychological Association.

American Psychiatric Association. (2013). Diagnostic and statistical manual of mental disorders (5th ed.). Arlington, VA: American Psychiatric Publishing.

Anderson, C. M., Turtura, J., & Parry, M. (2013). Addressing instructional avoidance with Tier II supports. Journal of Applied School Psychology, 29, 167-182. doi:10.1080/15377903.2013.778772

Benjamin, C. L., O'Neil, K. A., Crawley, S. A., Beidas, R. S., Coles, M., & Kendall, P. C. (2010). Patterns and predictors of subjective units of distress in anxious youth. Behavioural and Cognitive Psychotherapy, 38, 497-504.

Bradshaw, C. P., Waasdorp, T. E., & Leaf, P. J. (2012). Effects of school-wide positive behavioral interventions and supports on child behavior problems and adjustment. Pediatrics, 130, e1136-e1145. doi:10.1542/peds.2012-0243

Briesch, A., & Chafouleas, S. (2009). Review and analysis of literature on self-management interventions to promote appropriate classroom behaviors (1988-2008). School Psychology Quarterly, 24, 106-118. doi:10.1037/a0016159

Briesch, A. M., Chafouleas, S. M., & Riley-Tillman, T. C. (2010). Generalizability and dependability of behavior assessment methods to estimate academic engagement: A comparison of systematic direct observation and direct behavior rating. School Psychology Review, 39, 408-421.

Brown, L. A., Forman, E. M., Herbert, J. D., Hoffman, K. L., Yuen, E. K., & Goetter, E. M. (2011). A randomized controlled trial of acceptance-based behavior therapy and cognitive therapy for test anxiety: A pilot study. Behavior Modification, 35, 31-53. doi:10.1177/0145445510390930

Carnevale, T. D. (2013). Universal adolescent depression prevention programs: A review. The Journal of School Nursing, 29, 181-195. doi:10.1177/1059840512469231

Cartwright-Hatton, S., McNicol, K., & Doubleday, E. (2006). Anxiety in a neglected population: Prevalence of anxiety disorders in pre-adolescent children. Clinical Psychology Review, 26, 817-833. doi:10.1016/j.cpr.2005.12.002

Cassady, J. C. (2010). Anxiety in schools. New York, NY: Lang.

Chafouleas, S. M., Christ, T. J., & Riley-Tillman, T. C. (2009). Generalizability and dependability of scaling gradients on direct behavior ratings. Educational and Psychological Measurement, 69, 157-173. doi:10.1177/0013164408322005

Chafouleas, S. M., Kilgus, S. P., Riley-Tillman, T. C., Jaffery, R., & Harrison, S. (2012). Preliminary evaluation of various training components on accuracy of direct behavior ratings. Journal of School Psychology, 50, 317-334. doi:10.1016/j.jsp.2011.11.007

Chafouleas, S. M., Kilgus, S. P., & Wallach, N. (2010). Ethical dilemmas in school-based behavioral screening. Assessment for Effective Intervention, 35, 245-252. doi:10.1177/1534508410379002

Chafouleas, S. M., Riley-Tillman, T., & Christ, T. J. (2009). Direct Behavior Rating (DBR): An emerging method for assessing social behavior within a tiered intervention system. Assessment for Effective Intervention, 34, 195-200. doi:10.1177/1534508409340391

Chafouleas, S. M., Riley-Tillman, T. C., & Sugai, G. (2007). School-based behavior assessment and monitoring. New York, NY: Guilford Press.

Chafouleas, S. M., Sanetti, L. M. H., Kilgus, S. P., & Maggin, D. M. (2012). Evaluating sensitivity to behavioral change across consultation cases using direct behavior rating single-item scales (DBR-SIS). Exceptional Children, 78, 491-505.

Christ, T. J., Riley-Tillman, T. C., & Chafouleas, S. M. (2009). Foundation for the development and use of Direct Behavior Rating (DBR) to assess and evaluate student behavior. Assessment for Effective Intervention, 34, 201-213. doi:10.1177/1534508409340390

Cizek, G., & Burg, S. (2006). Addressing test anxiety in a high stakes environment. Thousand Oaks, CA: Corwin Press.

Cohen, J. (1992). A power primer. Psychological Bulletin, 112, 155-159.

Cook, C. R., Rasetshwane, K. B., Truelson, E., Grant, S., Dart, E. H., Collins, T. A., & Sprague, J. (2011). Development and validation of the Student Internalizing Behavior Screener: Examination of reliability, validity, and classification accuracy. Assessment for Effective Intervention, 36, 71-79. doi:10.1177/1534508410390486

Davis, A. S., Kruczek, T., & McIntosh, D. E. (2006). Understanding and treating psychopathology in schools: Introduction to the special issue. Psychology in the Schools, 43, 413-417. doi:10.1002/pits.20155

Donovan, C. L., & Spence, S. H. (2000). Prevention of childhood anxiety disorders. Clinical Psychology Review, 20, 509-531. doi:10.1016/S0272-7358(99)00040-9

Febbraro, G. R., & Clum, G. A. (1998). Meta-analytic investigation of the effectiveness of self-regulatory components in the treatment of adult problem behaviors. Clinical Psychology Review, 18, 143-161. doi:10.1016/S0272-7358(97)00008-1

Friedman, I. A., & Bendas-Jacob, O. (1997). Measuring perceived test anxiety in adolescents: A self-report scale. Educational and Psychological Measurement, 57, 1035-1046. doi:10.1177/0013164497057006012

Goodman, R. (1997). The Strengths and Difficulties Questionnaire: A research note. Journal of Child Psychology & Psychiatry & Allied Disciplines, 38, 581-586. doi:10.1111/j.1469-7610.1997.tb01545.x

Gregor, A. (2005). Examination anxiety: Live with it, control it or make it work for you? School Psychology International, 26, 617-635. doi:10.1177/0143034305060802

Gresham, F. M., Cook, C. R., Collins, T., Dart, E., Rasethwane, K., Truelson, T., & Grant, S. (2010). Developing a change-sensitive brief behavior rating scale as a progress monitoring tool for social behavior: An example using the Social Skills Rating System-Teacher Form. School Psychology Review, 39, 364-379.

Gresham, F. M., & Kern, L. (2004). Internalizing behavior problems in children and adolescents. In R. B. Rutherford, M. M. Quinn, & S. R. Mathur (Eds.), Handbook of research in emotional and behavioral disorders (pp. 262-281). New York, NY: Guilford Press.

Haynes, S. N., Richard, D. C. S., & Kubany, E. S. (1995). Content validity in psychological assessment: A functional approach to concepts and methods. Psychological Assessment, 7, 238-247. doi:10.1037/1040-3590.7.3.238

Hembree, R. (1988). Correlates, causes, effects, and treatment of test anxiety. Review of Educational Research, 58, 47-77. doi:10.3102/00346543058001047

Hembree, R. (1990). The nature, effects, and relief of mathematics anxiety. Journal for Research in Mathematics Education, 21, 33-46. doi:10.2307/749455

Herman, K. C., Riley-Tillman, T. C., & Reinke, W. M. (2012). The role of assessment in a prevention science framework. School Psychology Review, 41, 306-314.

Hoagwood, K., & Johnson, J. (2003). School psychology: A public health framework: I. From evidence-based practices to evidence-based policies. Journal of School Psychology, 41, 3-21. doi:10.1016/S0022-4405(02)00141-3

Huang, L., Stroul, B., Friedman, R., Mrazek, P., Friesen, B., Pires, S., & Mayberg, S. (2005). Transforming mental health care for children and their families. American Psychologist, 60, 615-627. doi:10.1037/0003-066X.60.6.615

Hunter, K. K., Chenier, J. S., & Gresham, F. M. (2014). Evaluation of check-in/check-out for students with internalizing behavior problems. Journal of Emotional and Behavioral Disorders, 22, 135-148. doi:10.1177/1063426613476091

Individuals With Disabilities Education Improvement Act of 2004 (IDEA), Pub. L. No. 108-446, 118 Stat. 2647 (2004).

Kalberg, J. R., Lane, K. L., Driscoll, S., & Wehby, J. (2011). Systematic screening for emotional and behavioral disorders at the high school level: A formidable and necessary task. Remedial and Special Education, 32, 506-520. doi:10.1177/0741932510362508

Kamphaus, R. W., & Reynolds, C. R. (2007). BASC-2: Behavioral and Emotional Screening System. Minneapolis, MN: Pearson.

Kendall, P. C., Robin, J., Hedtke, K., Suveg, C., Flannery-Schroeder, E., & Gosch, E. (2005). Considering CBT with anxious youth? Think exposures. Cognitive and Behavioral Practice, 12, 136-150.

Lane, K., Bruhn, A. L., Eisner, S. L., & Kalberg, J. (2010). Score reliability and validity of the Student Risk Screening Scale: A psychometrically sound, feasible tool for use in urban middle schools. Journal of Emotional and Behavioral Disorders, 18, 211-224. doi:10.1177/1063426609349733

Lowe, P. A., Lee, S. W., Witteborg, K. M., Prichard, K. W., Luhr, M. E., Cullinan, C. M., & Janik, M. (2008). The Test Anxiety Inventory for Children and Adolescents (TAICA): Examination of the psychometric properties of a new multidimensional measure of test anxiety among elementary and secondary school students. Journal of Psychoeducational Assessment, 26, 215-230. doi:10.1177/0734282907303760

Mace, F. C., Belfiore, P. J., & Hutchinson, J. M. (2001). Operant theory and research on self-regulation. In B. Zimmerman & D. Schunk (Eds.), Self-regulated learning and academic achievement: Theoretical perspectives (pp. 39-65). Mahwah, NJ: Erlbaum.

Mayer, J. D., Roberts, R. D., & Barsade, S. G. (2008). Human abilities: Emotional intelligence. Annual Review of Psychology, 59, 507-536.

McIntosh, K., Ty, S. V., & Miller, L. D. (2013). Effects of school-wide positive behavior support on internalizing problems: Current evidence and future directions. Journal of Positive Behavior Interventions. Advance online publication. doi:10.1177/1098300713491980

Merikangas, K. R., He, J. P., Burstein, M., Swanson, S. A., Avenevoli, S., Cui, L., & Swendsen, J. (2010). Lifetime prevalence of mental disorders in U.S. adolescents: Results from the National Comorbidity Survey-Adolescent Supplement (NCS-A). Journal of the American Academy of Child & Adolescent Psychiatry, 49, 980-989. doi:10.1016/j.jaac.2010.05.017

Merrell, K. W. (2008). Helping students overcome depression and anxiety: A practical guide (2nd ed.). New York, NY: Guilford Press.

Merrell, K. W., & Gueldner, B. A. (2010). Preventative interventions for students with internalizing disorders: Effective strategies for promoting mental health in schools. In M. R. Shinn & H. M. Walker (Eds.), Interventions for achievement and behavior in a three-tier model including RTI (3rd ed., pp. 729-823). Bethesda, MD: National Association of School Psychologists.

Miller, L. D., Shumka, E., & Baker, H. (2012). Special applications: A review of cognitive behavioral mental health interventions for children in clinical and school-based settings. In S. A. Lee & D. M. Edget (Eds.), Cognitive behavioral therapy: Applications, methods and outcomes (pp. 1-36). Hauppauge, NY: Nova Science.

Putwain, D. W. (2007). Test anxiety in UK schoolchildren: Prevalence and demographic patterns. British Journal of Educational Psychology, 77, 579-593. doi:10.1348/000709906X161704

Rapee, R. M., Kennedy, S., Ingram, M., Edwards, S., & Sweeney, L. (2005). Prevention and early intervention of anxiety disorders in inhibited preschool children. Journal of Consulting and Clinical Psychology, 73, 488-497. doi:10.1037/0022-006X.73.3.488

Riley-Tillman, T. C., Burns, M. K., & Gibbons, K. (2013). RTI applications, Volume 1: Assessment, analysis, and decision making. New York, NY: Guilford Press.

Riley-Tillman, T. C., Chafouleas, S. M., Sassu, K. A., Chanese, J. M., & Glazer, A. D. (2008). Examining agreement between Direct Behavior Ratings (DBRs) and systematic direct observation data for on-task and disruptive behavior. Journal of Positive Behavior Interventions, 10, 136-143. doi:10.1177/1098300707312542

Schutz, P. A., Distefano, C., Benson, J., & Davis, H. A. (2004). The emotional regulation during test-taking scale. Anxiety, Stress & Coping, 17, 253-269. doi:10.1080/10615800410001710861

Segool, N., von der Embse, N., Mata, A., & Gallant, J. (2014). Cognitive behavioral model of test anxiety in a high-stakes context: An exploratory study. School Mental Health, 6, 50-61. doi:10.1007/s12310-013-9111-7

Severson, H. H., Walker, H. M., Hope-Doolittle, J., Kratochwill, T. R., & Gresham, F. M. (2007). Proactive, early screening to detect behaviorally at-risk students: Issues, approaches, emerging innovations, and professional practices. Journal of School Psychology, 45, 193-223. doi:10.1016/j.jsp.2006.11.003

Shapiro, E., & Cole, C. L. (1994). Behavior change in the classroom: Self-management interventions. New York, NY: Guilford Press.

Shapiro, E. S., & Cole, C. L. (1999). Self-monitoring in assessing children's problems. Psychological Assessment, 11, 448-457. doi:10.1037/1040-3590.11.4.448

Sheffield, K., & Waller, R. (2010). A review of single-case studies utilizing self-monitoring interventions to reduce problem classroom behaviors. Beyond Behavior, 19, 7-13.

Spielberger, C. D. (1980). Preliminary professional manual for the Test Anxiety Inventory (TAI). Palo Alto, CA: Consulting Psychologists Press.

Spielberger, C. D., & Vagg, P. (1995). Test anxiety: A transactional process. In C. D. Spielberger & P. Vagg (Eds.), Test anxiety: Theory, assessment, and treatment (pp. 3-14). Washington, DC: Taylor & Francis.

Sulkowski, M., Joyce, D. K., & Storch, E. A. (2012). Treating childhood anxiety in schools: Service delivery in a response to intervention paradigm. Journal of Child and Family Studies, 21, 938-947. doi:10.1007/s10826-011-9553-1

Szafranski, D. D., Barrera, T. L., & Norton, P. J. (2012). Test Anxiety Inventory: 30 years later. Anxiety, Stress & Coping: An International Journal, 25, 667-677. doi:10.1080/10615806.2012.663490

Tolan, P. H., & Dodge, K. (2005). Children's mental health as a primary care and concern: A system for comprehensive support and service. American Psychologist, 60, 601-614. doi:10.1037/0003-066X.60.6.601

VanDerHeyden, A. M. (2013). Universal screening may not be for everyone: Using a threshold model as a smarter way to determine risk. School Psychology Review, 42, 402-414.

Volpe, R. J., & Briesch, A. M. (2012). Generalizability and dependability of single-item and multiple-item direct behavior rating scales for engagement and disruptive behavior. School Psychology Review, 41, 246-261.

Volpe, R. J., Gadow, K. D., Blom-Hoffman, J., & Feinberg, A. B. (2009). Factor analytic and individualized approaches to constructing brief measures of ADHD behaviors. Journal of Emotional and Behavioral Disorders, 17, 118-128. doi:10.1177/1063426608323370

von der Embse, N., Barterian, J., & Segool, N. (2013). Test anxiety interventions for children and adolescents: A systematic review of treatment studies from 2000-2010. Psychology in the Schools, 50, 57-71. doi:10.1002/pits.21660

von der Embse, N., & Hasson, R. (2012). Test anxiety and high-stakes tests: Implications for educators. Preventing School Failure, 56, 180-187. doi:10.1080/1045988X.2011.633285

von der Embse, N., Kilgus, S. P., Segool, N., & Putwain, D. (2013). Identification and validation of a brief test anxiety screening tool. International Journal of School and Educational Psychology, 1, 246-258. doi:10.1080/21683603.2013.826152

von der Embse, N., Mata, A., Segool, N., & Scott, E. C. (2014). Latent profile analysis of test anxiety: A pilot study. Journal of Psychoeducational Assessment, 32, 165-172. doi:10.1177/0734282913504541

von der Embse, N., & Witmer, S. E. (2014). High-stakes accountability: Student anxiety and large-scale testing. Journal of Applied School Psychology, 30, 132-156. doi:10.1080/15377903.2014.888529

Wolpe, J. (1969). The practice of behavior therapy. New York, NY: Pergamon Press.

Wren, D. G., & Benson, J. (2004). Measuring test anxiety in children: Scale development and internal construct validation. Anxiety, Stress, and Coping, 17, 227-240. doi:10.1080/10615800412331292606

Zeidner, M. (1998). Test anxiety: The state of the art. New York, NY: Plenum Press.

Appendix

Brief Assessment of Anxiety

Instructions: Please rate your responses to the following questions on a scale of 1 to 10, with 1 being no anxiety and 10 being very high anxiety.

1. I am nervous. 1 2 3 4 5 6 7 8 9 10

2. I am worried what others will think. 1 2 3 4 5 6 7 8 9 10

3. I feel restless. 1 2 3 4 5 6 7 8 9 10
Received March 13, 2014
Revision received July 1, 2014
Accepted July 7, 2014
