Education

Radiation Oncology Resident In-Training Examination

Sandra S. Hatch, MD, FACR,* Neha Vapiwala, MD,† Seth A. Rosenthal, MD, FACR,‡ John P. Plastaras, MD, PhD,† Albert L. Blumberg, MD, FACR,§ William Small Jr, MD, FACR,‖ Matthew J. Wenger, BBA,¶ and Marie E. Taylor, MD, FACR#

*University of Texas Medical Branch, Galveston, Texas; †University of Pennsylvania, Philadelphia, Pennsylvania; ‡Department of Radiation Oncology, Sutter Cancer Center, Sacramento, California; §Greater Baltimore Medical Center, Baltimore, Maryland; ‖Stritch School of Medicine, Loyola University, Chicago, Illinois; ¶ITPG, Inc., Vienna, Virginia; and #Baptist Memorial Hospital East, Memphis, Tennessee

The American College of Radiology (ACR) has been providing the Radiation Oncology In-Training (TXIT) examination to residents for more than 30 years. The examination is given annually in March, and the program is open and voluntary for each of the 4 years of residency training. It is a valuable resource that offers residents an opportunity to self-evaluate their knowledge and to identify specific areas of deficiency relative to their nationwide peers at the same level of training. It also serves as a tool for program directors to assess the effectiveness of their curriculum as benchmarked against other programs. The ACR welcomes constructive feedback and promotes open, multidirectional communication with residents, educators, and program directors. Questions have arisen over the last several years regarding the test (1), suggesting that additional information, which is provided here, would help to address misconceptions.

The ACR Commission on Education's Skills Assessment Committee is responsible for the resident in-service examination. The examinations are fully validated and rooted in the opinions of experts in each of the 13 sections/panels of radiation oncology: biology; physics; statistics; bone and soft tissue; breast; central nervous system/eye; gastrointestinal tract; genitourinary tract; gynecology; head, neck, and skin; lung; lymphoma and leukemia; and pediatrics. Each section head directs the composition of the specific panel from contributors with expertise in the topic.

Construction of the examination involves multiple levels of peer review and editing throughout the course of a year. The questions are organized by subcategory during the preliminary edits. The section heads and the committee chair subsequently convene for 2 days each fall to collectively review all questions for further editing, to eliminate duplicate concepts, and to correct formatting errors. The section heads and the committee chair choose the best questions to represent their panel in the final test instrument. The testing company then reviews, edits, and assembles the examination. As a quality initiative starting in 2015, the test will undergo an additional content and accuracy review by the committee chair and committee representatives before publication. The content for the examination is based on the published study guides of the American Board of Radiology (ABR) for clinical oncology, physics, biology, and the oral examination (2).

Reprint requests to: Sandra S. Hatch, MD, FACR, University of Texas Medical Branch, 301 University Blvd, 1.4 John McCullough Bldg, Galveston, TX 77555-0711. E-mail: [email protected]

Author titles/roles are as follows. S.S.H.: American College of Radiology (ACR) 2015-2020 Radiation Oncology In-Training (TXIT) Exam Chair; N.V.: ACR Radiation Oncology Council, Education Committee Member and Vice President, Association of Directors of Radiation Oncology Programs; S.A.R.: Chair, ACR Commission on Radiation Oncology; J.P.P.: ACR Commission on Radiation Oncology; A.L.B.: Immediate Past President, ACR; W.S.: Radiation Oncology Commission Education Chair and CARROS Executive Committee; M.J.W.: Director, Examination, Assessment & Curriculum Development Division, ITPG, Inc.; M.E.T.: ACR TXIT Exam Chair Ex-Officio.

Conflict of interest: none.

Int J Radiation Oncol Biol Phys, Vol. 92, No. 3, pp. 532-535, 2015. http://dx.doi.org/10.1016/j.ijrobp.2015.02.038


In fact, the majority of TXIT examination contributors have had roles with the ABR as question contributors to the certification examination, the recertification examination, the maintenance of certification program, and the oral boards. The TXIT examination has undergone changes throughout the last 2 years to reflect weighting of categories relevant to clinical practice while incorporating changes in the basic sciences related to the radiation oncology patient population. Like the ABR certification examination, the TXIT examination also includes questions pertinent to treatment planning and radiation-correlated morbidity.

Although the content of the 2 examinations is aligned, the tests are fundamentally different and administered by 2 separate entities. The ABR board examination is a criterion-referenced examination, and the ACR TXIT examination is a norm-referenced examination (3, 4). A criterion-referenced test reports the individual's performance in terms of whether the questions were answered correctly. A norm-referenced test reports whether the individual answered more questions correctly than the other test takers. These examination designs are, by definition, fundamentally different in both their aims and in what they seek to measure. The ABR calculates an individual's examination score and compares it against a predetermined minimum acceptable performance level. This level is established by a group of content specialists and educators who determine the expected competency threshold using a modified Angoff method (5, 6). In addition, the board examinations are occupational tests, whereas the ACR examinations are educational tests.

The ACR has been consistent and clear that it is not, nor has it ever been, the intent of the examination's scores to be used as the sole factor in evaluating residents or as "prep" for the ABR board examination. In fact, the ACR provides detailed information regarding this subject in the score packets that are sent each year to participating program directors. The score packet itself states, "The purpose of the examination is to provide your residents with information that is useful to them in evaluating their own progress and to provide you with data that is helpful in analyzing and evaluating your program. The examination is intended to be a measure of general achievement in radiation oncology and related areas for residents and for program directors. It should not be used as the ONLY measure of examinees' performance for qualification to any postgraduate program." Likewise, the results of the examination should not be used as a pre-employment criterion.

Each year, after the administration of the examination and tabulation of the resident responses, a full psychometric test and item analysis is conducted to ascertain how the test performed statistically overall and how each of the 300 items performed individually. At this stage of validation, if any problematic items are identified, they are reviewed by the appropriate committee members, and steps are taken as necessary (e.g., correcting a key, 0-weighting an item) before test scores are finalized and reported to institutions.
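For readers unfamiliar with standard setting, the following minimal sketch illustrates the general idea behind a modified Angoff cut score: each judge estimates the probability that a minimally competent candidate would answer each item correctly, and the cut score is the average across judges of the summed estimates. The judge ratings, item count, and function name here are hypothetical, and the sketch is not a description of the ABR's actual procedure.

# Hypothetical sketch of a modified Angoff cut-score calculation.
# Each row is one judge's per-item estimates of the probability that a
# minimally competent candidate answers that item correctly.
judge_ratings = [
    [0.6, 0.8, 0.5, 0.7, 0.9],    # judge 1
    [0.5, 0.7, 0.6, 0.8, 0.85],   # judge 2
    [0.7, 0.75, 0.55, 0.65, 0.9]  # judge 3
]

def modified_angoff_cut_score(ratings):
    """Average, across judges, of each judge's expected raw score
    (the sum of that judge's per-item probability estimates)."""
    judge_expected_scores = [sum(items) for items in ratings]
    return sum(judge_expected_scores) / len(judge_expected_scores)

cut = modified_angoff_cut_score(judge_ratings)
print(f"Cut score: {cut:.2f} of {len(judge_ratings[0])} items")
# A candidate passes a criterion-referenced examination if their raw score
# meets or exceeds this predetermined cut score, regardless of how other
# candidates perform; a norm-referenced score, by contrast, depends on the
# performance of the peer group.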

Table 1. 2013 examination psychometric data

Statistic         Value
Items             302
Effective length  302
Maximum           250
Median            189
Minimum           78
Mean              187
SD                24.86
Skew              0.47
Kurtosis          3.24
α                 0.915
SEM               7.26
Mean P            +0.62
Mean D raw        11.53
Mean Pearson      0.20
Mean biserial     0.28
SD of P           +0.23
SD of biserial    0.17

Another important component of any examination instrument's performance is its reliability, which refers to the repeatability of test scores. The TXIT examination's reliability and psychometric performance meet or exceed all relevant testing industry standards (Tables 1 and 2). For example, the coefficient α serves as a measure of the internal consistency, or statistical homogeneity, of a scale and thereby provides an estimate of the scale's reliability. Testing instruments with α values above the mid-0.80s are considered highly reliable; the 2014 ACR TXIT examination has an α of 0.909, well above the industry standard for reliability. Furthermore, its mean biserial coefficient (item discrimination), mean item difficulty, and SDs, additional measures of psychometric validity, have been well within acceptable standards for the last 5 years.

Table 2. 2014 examination psychometric data

Statistic         Value
Items             300
Effective length  300
Maximum           242
Median            173
Minimum           109
Mean              171.66
SD                22.68
Skew              0.12
Kurtosis          2.63
α                 0.909
SEM               7.45
Mean P            +0.57
Mean D raw        12.10
Mean Pearson      0.18
Mean biserial     0.25
SD of P           +0.23
SD of biserial    0.17


Table 3. Mathematic definitions and formulas

Items: Items is the number of test items included in the analysis of each scale. Items may be, and usually are, included in several scales, causing the sum across scales to exceed the length of the test.

Maximum: The maximum is the highest score encountered for each scale.

Median: The median is the middle-most score of the distribution of scores for each scale. If the number of valid examinees is even, the median is taken as the midpoint of the 2 middle-most scores.

Minimum: The minimum is the lowest score encountered for each scale.

Mean: The mean is the arithmetic average for each scale across all examinees. Mathematically, \bar{X} = \frac{1}{N} \sum_i X_i.

SD: The SD, or standard deviation, is a standard measure of the dispersion of scores around the mean of the scores. Mathematically, S = \sqrt{\frac{1}{N} \sum_i (X_i - \bar{X})^2}. Note that the descriptive (biased) formula is used in the calculation of the SD.

Skew: The skew indicates the degree of asymmetry in the distribution of scores. A positive value indicates that the tail of the distribution stretches toward high scores; a negative value indicates that the tail extends toward the low scores. Technically, the Fisher skewness is computed. Mathematically, \mathrm{skew} = \frac{\frac{1}{N} \sum_i (X_i - \bar{X})^3}{S^3}.

Kurtosis: The kurtosis indicates the degree of peakedness in a distribution of scores. The Pearson kurtosis is calculated; this differs from the Fisher kurtosis by a constant factor of 3. The Pearson kurtosis of a normal distribution is 3.0. Mathematically, \mathrm{kurtosis} = \frac{\frac{1}{N} \sum_i (X_i - \bar{X})^4}{S^4}.

α: Coefficient α, a measure of the internal consistency or statistical homogeneity of a scale, provides an estimate of the scale's reliability. Alpha is the generalization of the KR-20 reliability formula. Mathematically, \alpha = \frac{k}{k-1}\left(1 - \frac{\sum_g S_g^2}{S_X^2}\right), where k is the number of items, S_g^2 is the variance of item g, and S_X^2 is the variance of the total scores.

See references (7-9).
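To make the definitions in Table 3 concrete, the following is a minimal sketch in Python that computes these scale statistics and coefficient α for a small, entirely hypothetical 0/1 item-response matrix; the data and helper names are illustrative assumptions, not the testing company's actual scoring code.

# Minimal sketch of the Table 3 statistics, using a hypothetical
# item-response matrix (rows = examinees, columns = items; 1 = correct).
responses = [
    [1, 1, 0, 1, 1],
    [1, 0, 0, 1, 1],
    [0, 1, 1, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 1],
]

def mean(xs):
    return sum(xs) / len(xs)

def sd_biased(xs):
    # Descriptive (biased) formula: divide by N, not N - 1.
    m = mean(xs)
    return (sum((x - m) ** 2 for x in xs) / len(xs)) ** 0.5

def fisher_skew(xs):
    m, s = mean(xs), sd_biased(xs)
    return (sum((x - m) ** 3 for x in xs) / len(xs)) / s ** 3

def pearson_kurtosis(xs):
    # Pearson kurtosis; a normal distribution gives 3.0.
    m, s = mean(xs), sd_biased(xs)
    return (sum((x - m) ** 4 for x in xs) / len(xs)) / s ** 4

def coefficient_alpha(matrix):
    # alpha = k/(k-1) * (1 - sum of item variances / total-score variance)
    k = len(matrix[0])
    item_vars = [sd_biased([row[i] for row in matrix]) ** 2 for i in range(k)]
    totals = [sum(row) for row in matrix]
    return k / (k - 1) * (1 - sum(item_vars) / sd_biased(totals) ** 2)

total_scores = [sum(row) for row in responses]
print("mean:", mean(total_scores))
print("SD:", sd_biased(total_scores))
print("skew:", fisher_skew(total_scores))
print("kurtosis:", pearson_kurtosis(total_scores))
print("alpha:", coefficient_alpha(responses))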

The data are evaluated further to determine their fit to a normal or Gaussian distribution (skew, kurtosis, mean raw D, mean Pearson, and SEM). Table 3 provides the mathematic definitions and formulas.

The residency program directors are provided comprehensive score information yearly concerning mean norm-referenced scores at the national, institutional, and individual levels, along with score data for each of the clinical areas of practice, physics, and radiobiology. The percent correct score is obtained by dividing the number of items the resident answered correctly by the total number of questions in the test or in a section and then multiplying by 100. The percentile rank, however, indicates a resident's position relative to peers at the same level of training. For example, a percentile rank of 78% means that the resident scored higher than 78% of other residents at the same level of training. A resident's percentile rank can therefore vary depending on which group or residency level is used to determine the ranking. As an example, on a given test section, a percent correct score of 52% could result in a percentile rank of 78% for level 1, 65% for level 2, 56% for level 3, and 35% for level 4. For most test items, the knowledge level of the group increases with the level of training; therefore, percentile ranks tend to be lower for higher-level residents than for lower-level residents with the same percent correct scores.
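As a concrete illustration of the two score types, the following minimal sketch shows how a percent correct score and a level-specific percentile rank could be computed. The cohort scores and function names are invented for illustration and are not actual TXIT data; they simply show why the same percent correct maps to different percentile ranks in different training-level cohorts.

# Hypothetical illustration of percent correct vs percentile rank.
def percent_correct(num_correct, num_items):
    return 100.0 * num_correct / num_items

def percentile_rank(resident_percent, cohort_percents):
    # Fraction of same-level peers the resident scored higher than.
    below = sum(1 for p in cohort_percents if p < resident_percent)
    return 100.0 * below / len(cohort_percents)

# A resident answers 156 of 300 items correctly (52%).
score = percent_correct(156, 300)

# Invented same-level cohorts; each training level has its own
# distribution of percent correct scores.
level_1_cohort = [38, 42, 45, 47, 49, 50, 51, 53, 55, 60]
level_4_cohort = [48, 52, 55, 57, 58, 60, 62, 64, 66, 70]

print(f"Percent correct: {score:.0f}%")
print(f"Rank vs level 1 peers: {percentile_rank(score, level_1_cohort):.0f}%")
print(f"Rank vs level 4 peers: {percentile_rank(score, level_4_cohort):.0f}%")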

As of 2014, resident TXIT examination scores were incorporated into the Radiological Society of North America/Association of Program Directors in Radiology myPortfolio tool (9). This information is made available for independent use by residents and their program administrators to track residents' educational progress throughout training. The answer key for the examination is also provided to the program director annually, with the recommendation that the key be used in an instructional manner, such as in follow-up discussions or meetings with residents regarding their test answers. The most recent examination is also published annually on the ACR website, complete with rationales and annotated bibliographies provided by section experts to suggest further reading on each topic. The breadth of this feedback will enable residents to assess their own progress in more detail and to build their competencies with focused study efforts guided by these data.

The ACR seeks the utmost quality and excellence in all of its endeavors. To that end, we understand that the process of professional assessment is one of continuous improvement.


We will update the survey questions at the completion of the examination to enable each resident to provide us with meaningful, timely feedback. We will continue to critically analyze the relevance of the test material and to consider future content addressing emerging science and treatment guidelines. The ACR remains committed to providing the examination as a means for residents to self-assess their progress over 4 years of specialty training and strives to make the examination an important and useful tool.

References

1. Morris A. The radiation oncology in-training exam: An appeal for better testing. Int J Radiat Oncol Biol Phys 2013;87:443-445.
2. The American Board of Radiology. Radiation Oncology Study Guides. Available at: http://www.theabr.org/ic-ro-study. Accessed November 23, 2014.


3. American Educational Research Association, American Psychological Association, National Council on Measurement in Education. Standards for Educational and Psychological Testing. Washington, DC: AERA Publications; 2014.
4. Millman J, Greene J. The specification and development of tests of achievement and ability. In: Linn RL, editor. Educational Measurement. Phoenix, AZ: Oryx Press; 1993. p. 335-366.
5. Ricker KL. Setting cut-scores: A critical review of the Angoff and modified Angoff methods. Alberta J Educ Res 2006;52:53-56.
6. Angoff WH. Differential item functioning methodology. In: Holland PW, Wainer H, editors. Differential Item Functioning. Hillsdale, NJ: Lawrence Erlbaum Associates; 1993. p. 4.
7. Lord FM. Standard errors of measurement at different ability levels. J Educ Measurement 1984;21:239-243.
8. Lord FM, Novick MR. Statistical Theories of Mental Test Scores. Reading, MA: Addison-Wesley; 1968.
9. Radiological Society of North America. RSNA/APDR Resident Learning Portfolio. Available at: www.rsna.org/RSNA-APDR_Resident_Learning_Portfolio.aspx. Accessed December 31, 2014.
