ORIGINAL CONTRIBUTION

Key words: quality assurance; radiograph; misinterpretation

Clinically Significant Radiograph Misinterpretations at an Emergency Medicine Residency Program

Radiographic misinterpretation rates have been suggested as a quality assurance tool for assessing emergency departments and individual physicians, but have not been defined for emergency medicine residency programs. A study was conducted to define misinterpretation rates for an emergency medicine residency program, compare misinterpretation rates among various radiographic studies, and determine differences with respect to level of training. A total of 12,395 radiographic studies interpreted by emergency physicians during a consecutive 12-month period were entered into a computerized data base as part of our quality assurance program. The radiologist's interpretation was defined as correct. Clinical significance of all discrepancies was determined prospectively by ED faculty. Four hundred twenty-five (3.4%) total errors and 350 (2.8%) clinically significant errors were found. There was a difference in clinically significant misinterpretation rates among the seven most frequently obtained radiograph studies (P < .0005, chi-square), accounted for by the 9% misinterpretation rate for facial films. No difference (P = .421) was noted among full-time, part-time, third-year, second-year, and "other" physicians. This finding is likely due to faculty review of residents' readings. Evaluation of misinterpretation rates as a quality assurance tool is necessary to determine the role of radiographic quality assurance in emergency medicine resident training. Educational activities should be directed toward radiographic studies with higher-than-average reported misinterpretation rates. [Gratton MC, Salomone JA III, Watson WA: Clinically significant radiograph misinterpretations at an emergency medicine residency program. Ann Emerg Med May 1990;19:497-502.]

Matthew C Gratton, MD, FACEP
Joseph A Salomone III, MD, FACEP
William A Watson, PharmD
Kansas City, Missouri

From the Department of Emergency Health Services, School of Medicine, University of Missouri-Kansas City, and Truman Medical Center, Kansas City, Missouri.

Received for publication October 5, 1989. Accepted for publication January 24, 1990.

Address for reprints: Matthew C Gratton, MD, FACEP, Department of Emergency Health Services, Truman Medical Center, 2301 Holmes, Kansas City, Missouri 64108.

INTRODUCTION
Radiographic misinterpretation rates (MIRs) have been suggested as a quality assurance tool for assessing emergency departments and individual physicians.1 Previous studies have reported MIRs for different types of EDs; however, MIRs have not been defined for emergency medicine residency programs.2-9 The purposes of this study were to define MIRs for an ED with a residency program, compare MIRs among various radiographic studies, and determine whether differences in MIRs exist with respect to level of training.

METHODS
Truman Medical Center is an urban teaching hospital associated with the University of Missouri-Kansas City School of Medicine. It is a Level I trauma center that treats approximately 38,000 adult (aged more than 15 years) patients per year. The Department of Emergency Health Services has administered an emergency medicine residency (PGY1, 2, 3) since 1973. The department is generally staffed by one PGY3 emergency medicine resident, one PGY2 emergency medicine resident, two senior medical students, and one attending staff physician who is board-certified or board-prepared in emergency medicine. One internal medicine or family practice resident or one PGY1 emergency medicine resident also may be staffing the department at a given time. All radiographs ordered in the ED from January 1 to December 31, 1988,

were included in the study. Radiographic studies that required the presence of a radiologist, such as ultrasounds, intravenous pyelograms, and intravascular contrast studies, were excluded. Noncontrast head computed tomography scans were included in the study. Emergency physician interpretations were recorded on the patient's chart and on a blue 3 x 5 card that accompanied the radiograph back to the radiology department in the radiograph jacket. Discrepancies between the ED and radiology interpretations were defined as abnormalities identified by the radiologist that were not noted by the emergency physician. These discrepancies were called "underreads." No attempt was made to look at findings seen by the emergency physician and not confirmed by the radiologist.

Discrepancies were discovered in two ways. First, the radiologist, on reviewing the radiograph and the card with the ED interpretation, would call the ED attending staff if, in his opinion, a significant discrepancy existed. Second, all radiographs obtained and the ED interpretations were entered in a computer data base on a daily basis. As final radiologist reports became available, these were entered in the data base and compared with the ED interpretation by an accredited medical records technician. If a discrepancy was noted, the ED attending staff members were notified. On notification by either the radiologist or the ED medical records staff, the attending emergency physician reviewed both the patient's chart and the conflicting radiograph interpretation to determine if a discrepancy existed. If a discrepancy was present, the attending emergency physician determined whether it was clinically significant. Clinical significance was defined as any discrepancy that required the emergency physician to recontact the patient. This would include a phone call or letter urging immediate return to the ED (ie, possible missed cervical-spine fracture), contact urging early clinical follow-up (ie, evidence of missed chip fracture), or routine clinic follow-up (ie, missed pulmonary nodule). Which patients needed recontact was determined solely by the clinical judgment of the ED attending staff. The radiologist's interpretation was defined as correct. Radiologist readings including such words as "possible" or "cannot rule out" were included.

Total and clinically significant MIRs were compared among radiographic studies by the frequency that a study was obtained and by the level of training of the physician interpreting the radiograph. A chi-square analysis of multiple groups was used to compare MIRs, with P <= .05 defined as statistical significance. Power was calculated when P > .05 using the K-sample calculation (STPLAN, copyright 1986, University of Texas System Cancer Center, MD Anderson Hospital and Tumor Institute Department of Biomathematics).

TABLE 1. MIRs by type of study

Study                                                                No. of Studies   No. of Total Errors (%)   No. of Clinically Significant Errors (%)
Skull                                                                99               2 (2.0)                   2 (2.0)
Face                                                                 488              59 (12.1)                 44 (9.0)
Mandible                                                             157              10 (6.4)                  9 (5.7)
Cervical spine                                                       1,008            39 (3.9)                  38 (3.8)
Thoracic spine                                                       76               2 (2.6)                   2 (2.6)
Thoracic-lumbar spine                                                39               1 (2.6)                   1 (2.6)
Lumbar-sacral spine                                                  241              6 (2.5)                   4 (1.7)
Pelvis                                                               367              5 (1.4)                   5 (1.4)
Clavicle                                                             29               0                         0
Scapula                                                              22               1 (4.5)                   1 (4.5)
Shoulder                                                             379              12 (3.2)                  5 (1.3)
Humerus                                                              108              2 (1.9)                   2 (1.9)
Elbow                                                                217              5 (2.3)                   5 (2.3)
Forearm                                                              173              4 (2.3)                   1 (0.6)
Wrist                                                                375              7 (1.9)                   6 (1.6)
Hand                                                                 613              24 (3.9)                  18 (2.9)
Fingers (including thumb)                                            293              7 (2.4)                   3 (1.0)
Hip                                                                  129              1 (0.8)                   1 (0.8)
Femur                                                                91               1 (1.1)                   0
Knee (including patella)                                             533              18 (3.4)                  15 (2.8)
Leg (tibia-fibula)                                                   266              5 (1.9)                   5 (1.9)
Ankle                                                                693              22 (3.2)                  16 (2.3)
Foot                                                                 487              23 (4.7)                  18 (3.7)
Calcaneus                                                            24               0                         0
Toes                                                                 92               3 (3.3)                   2 (2.2)
Chest                                                                4,429            139 (3.1)                 124 (2.8)
Sternum                                                              11               0                         0
Abdomen (includes acute abdominal series/kidney, ureter, bladder)    524              14 (2.7)                  11 (2.1)
Head computed tomography                                             216              7 (3.2)                   7 (3.2)
Soft tissue neck                                                     47               3 (6.4)                   3 (6.4)
Sinus                                                                65               2 (3.1)                   2 (3.1)
Rib                                                                  32               0                         0
Miscellaneous (includes studies lost)                                72               1 (1.4)                   0
Total                                                                12,395           425 (3.4)                 350 (2.8)

TABLE 2. MIRs in most frequently obtained studies

Rank   Study            No. of Studies (%)   No. of Total Errors (%)   Clinically Significant Errors (%)
1      Chest            4,429 (35.7)         139 (3.1)                 124 (2.8)
2      Cervical spine   1,008 (8.1)          39 (3.9)                  38 (3.8)
3      Ankle            693 (5.6)            22 (3.2)                  16 (2.3)
4      Hand             613 (4.9)            24 (3.9)                  18 (2.9)
5      Knee             533 (4.3)            18 (3.4)                  15 (2.8)
6      Abdomen          524 (4.2)            14 (2.7)                  11 (2.1)
7      Face             488 (3.9)            59 (12.1)                 44 (9.0)
       Total            8,288 (66.7)         315                       266

There is a statistically significant difference in total and clinically significant MIR (P < .0005) among the seven most frequently obtained studies, accounted for by the high total and clinically significant MIR for facial films.

TABLE 3. Studies with highest clinically significant MIRs

Study              No. of Studies   Clinically Significant MIR (%)
Face               488              44 (9.0)
Soft tissue neck   47               3 (6.4)
Mandible           157              9 (5.7)
Scapula            22               1 (4.5)
Cervical spine     1,008            38 (3.8)
Foot               487              18 (3.7)

RESULTS
In all, 12,395 radiographic studies interpreted by emergency physicians were entered in the ED data base during the 12-month study period. During this period, there were 35,484 patient visits to the ED. There were

425 (3.4%) total discrepancies, with 350 (2.8%) judged clinically significant. Forty-one of the errors were not sufficiently documented to determine clinical significance. For study purposes, these discrepancies were assumed to be clinically significant and are included in the total of 350. Misinterpretation rates by specific radiographic study are listed (Table 1). Table 2 lists the seven most commonly obtained studies and their MIRs. There was a statistically significant difference (P < .0005) in total and clinically significant MIRs among the seven most frequently obtained studies. This difference was due to the high MIR for facial films. Table 3 lists the six radiographic studies with the highest clinically significant MIRs. Misinterpretation rates by level of training are listed (Table 4). There was no statistically significant difference in MIRs among the different physician groups.

DISCUSSION
The Joint Commission on Accreditation of Healthcare Organizations requires that all radiographs ordered and read by emergency physicians be reviewed by a radiologist and that a mechanism be in place to recall patients if significant errors are made.10 In the course of complying with this regulation, a radiographic misinterpretation rate was generated for our ED. In attempting to compare our MIR of 3.4% with that of other EDs, we could not find a "gold standard." A number of studies have reported MIRs ranging from 2.4% to 8.9% (Table 5).2-9 Few of the studies accurately reflect our setting.
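The seven-study difference reported above can be illustrated with a standard chi-square test of homogeneity on the clinically significant counts in Table 2. The sketch below is ours, not the original STPLAN analysis, and uses only plain Python; the helper name is an assumption for illustration.

```python
# Chi-square test of homogeneity on a 2 x 7 table built from Table 2:
# for each study, (no. of studies, clinically significant errors).
# This is an illustrative recomputation, not the original STPLAN analysis.

counts = {
    "Chest": (4429, 124),
    "Cervical spine": (1008, 38),
    "Ankle": (693, 16),
    "Hand": (613, 18),
    "Knee": (533, 15),
    "Abdomen": (524, 11),
    "Face": (488, 44),
}

def chi_square(table):
    """Return (statistic, per-group contributions) for a 2 x k table."""
    total_n = sum(n for n, _ in table.values())
    total_err = sum(e for _, e in table.values())
    rate = total_err / total_n  # pooled clinically significant MIR
    stat, contrib = 0.0, {}
    for study, (n, err) in table.items():
        exp_err = n * rate          # expected errors under homogeneity
        exp_ok = n - exp_err        # expected non-errors
        c = (err - exp_err) ** 2 / exp_err + (err - exp_err) ** 2 / exp_ok
        contrib[study] = c
        stat += c
    return stat, contrib

stat, contrib = chi_square(counts)
print(f"chi-square = {stat:.1f} on {len(counts) - 1} df")
print(f"facial films contribute {contrib['Face']:.1f}")
```

With 6 degrees of freedom the critical value at P = .0005 is roughly 24, so a statistic of this size is consistent with the reported P < .0005, and most of it comes from the facial films cell, matching the authors' attribution.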


TABLE 4. MIR by level of training

Group            No.   No. of Studies   Total Errors (%)   Clinically Significant Errors (%)
Staff
  Full time      7     2,173            74 (3.4)           62 (2.9)
  Part time      12    733              24 (3.3)           19 (2.6)
Residents
  Third year     12    4,730            162 (3.4)          137 (2.9)
  Second year    13    3,316            124 (3.7)          103 (3.1)
  Rotating       29    1,293            38 (2.9)           27 (2.1)
Unidentified           150              3 (2.0)            2 (1.3)
P                                       .678               .421
Power                                   0.87               0.82

There was no statistically significant difference in total or clinically significant MIRs among the different levels of providers.

One study included only missed fractures2 and one involved a pediatric hospital.3 A study from an emergency medicine residency reported error rates only for faculty and did not include residents.4 Three British studies reported readings by "casualty officers."7-9 Two studies from emergency medicine residencies that did not separate readings by physician training had MIRs of 5.4% and 2.4%.5,6 In a recent survey by O'Leary et al,11 172 ED directors estimated the ED to radiology department discrepancy rate to be 4.6%. In the same study, 159 radiology department directors estimated the discrepancy rate to be 5.9%. Using our MIR and the previously reported data, one would be tempted to estimate an MIR gold standard of approximately 5%. If the standard is to be used by individual EDs to assess their own performances, then this estimate is acceptable. If MIRs are to be used by some regulatory body to declare one ED superior to another, then significant problems remain. First, what is the definition of discrepancy? If the emergency physician reading of a chest radiograph is "no infiltrate" and the radiologist says "chronic obstructive pulmonary disease," is this a discrepancy? Second, can the radiologist reading be used as the gold standard for judging the ED reading? Underread rates of 3% and 4.5% have been reported for radiologist readings of ED films.7,8 Until these questions are answered, it is inappropriate to use MIRs for anything more than tools for gross comparison of one aspect of ED function.
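The training-level comparison in Table 4 can be checked the same way, with a chi-square test of homogeneity on the clinically significant counts for the six provider groups. Again this is an illustrative pure-Python sketch of ours, not the original STPLAN run.

```python
# Chi-square test of homogeneity on a 2 x 6 table built from Table 4:
# for each provider group, (no. of studies, clinically significant errors).
# Illustrative recomputation; the paper's analysis used STPLAN.

groups = {
    "full time": (2173, 62),
    "part time": (733, 19),
    "third year": (4730, 137),
    "second year": (3316, 103),
    "rotating": (1293, 27),
    "unidentified": (150, 2),
}

total_n = sum(n for n, _ in groups.values())
total_err = sum(e for _, e in groups.values())
rate = total_err / total_n  # pooled clinically significant MIR (350/12,395)

stat = 0.0
for n, err in groups.values():
    exp_err = n * rate                           # expected error cell
    stat += (err - exp_err) ** 2 / exp_err
    stat += (err - exp_err) ** 2 / (n - exp_err)  # matching non-error cell

print(f"pooled rate = {rate:.3f}")
print(f"chi-square = {stat:.2f} on 5 df")
```

A statistic near 5 on 5 degrees of freedom falls well short of the 11.07 critical value at P = .05, consistent with the reported P = .421 and the absence of a detectable difference across training levels.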

In an attempt to be more relevant to patient care, clinically significant MIRs can be generated. Clinically significant MIRs have been reported to range from 0.06% to 3% (Table 5). Unfortunately, the definition of clinically significant varied widely between studies, ranging from "not defined"7,9 to "treatment potentially altered"5 and "treatment significantly altered."6 One study based clinical significance on "specific preestablished criteria."4 These criteria included but were not limited to any undetected finding that would change immediate treatment or follow-up, and incidental findings, even if unrelated to the chief complaint, that required a change in treatment or follow-up. Our definition of clinically significant was most comparable to that used by Fleisher et al3: a finding that "led to an attempt to recall" the patient. Our clinically significant MIR of 2.8% was similar to their rate of 3.0%. As with total MIRs, a gold standard of less than 3% clinically significant MIR could be postulated and used as a yardstick by individual EDs. If the clinically significant MIR is to be used as a more precise measure by evaluative agencies, the same problems apply as with total MIRs. Particularly troublesome is the definition of clinically significant. For example: A patient is seen for a severe ankle sprain with the radiograph interpreted as negative in the ED. The ankle is splinted and the patient told to follow up with an orthopedist in a few days. Later, the radiologist interpretation shows a small acute avulsion fracture of the lateral malleolus.

A fracture was missed, but there would have been no change in treatment had the fracture been apparent. Would there be a change in follow-up? Even if no change in follow-up was necessary, should the patient be contacted to reconfirm follow-up and notify him of the error? In our opinion, the answers to these types of questions will be determined by the physician based on individual judgment.

TABLE 5. Other ED MIR studies

Setting                                      Type of Studies      Readers                                                    No. of Studies   Total MIR                   Clinically Significant MIR   Definition of Clinical Significance
Teaching hospital ED, 1984 (ref 2)           "Missed fractures"   Attending housestaff (not emergency medicine residency)    4,907            162 (3.3%)                  16 (0.33%)                   "Judged to have clinical importance"
Pediatric ED, 1983 (ref 3)                   All                  Attending and residents                                    564              50 (8.9%) (discordant)      16 (3%)                      "Led to attempt to recall"
Emergency medicine residency, 1987 (ref 4)   All                  Faculty only                                               6,740            361 (5.36%)                 40 (0.59%)                   "Based on specific preestablished criteria"
Emergency medicine residency, 1988 (ref 5)   All                  Emergency physicians                                       1,872            102 (5.4%) (discordant)     38 (2.7%)                    "Treatment potentially altered"
Emergency medicine residency, 1977 (ref 6)   All                  ED reading                                                 8,021            196 (2.4%)                  5 (0.06%)                    "Significantly altered treatment"
Teaching hospital (England), 1980 (ref 7)    All                  Casualty officer                                           531              (3.5%) (false-negatives)    (2.5%)                       Not defined
England, 1985 (ref 8)                        All                  Casualty officer                                           1,496            63 (4.2%)                   34 (2.3%)                    "Important"
England, 1983 (ref 9)                        All                  Casualty officer                                           1,000            44 (4.4%)                   23 (2.3%)                    Not defined

If institution-to-institution comparisons are difficult, can MIRs be used to compare physicians within a given institution that presumably uses consistent criteria to define discrepancies and "clinically significant"? Although individual physicians were not compared in this study, groups of physicians with different levels of training were, and no statistically significant difference in MIRs or clinically significant MIRs was found (Table 4). It is unlikely that rotating resident physicians or PGY2 emergency medicine residents can read radiographs as well as full-time staff physicians. We believe the most likely explanation for this lack of difference in MIRs by different levels of training is staff physician review of resident films at the time of initial ED reading. The policy at our institution is for ED staff to review all cases with rotating and PGY1 residents prior to patient discharge. This should include review of radiographs. Particularly difficult radiographs would be most likely to be brought to staff physicians by all residents. This staff review would have an equalizing effect on MIRs that would obscure group differences. This equalizing effect would invalidate physician-to-physician comparisons in a teaching setting. Another explanation for the similarity of MIRs among groups would be a different case mix. If staff and third-year residents saw more difficult cases, and, therefore, presumably more difficult radiographs, their MIRs might be similar to those of second-year residents and rotators.

Misinterpretation rates may be useful to identify films more likely to be misread. While Overton4 found no specific types of radiographs more likely to be misread, other studies have found chest radiographs;6 navicular, elbow, and calcaneus radiographs;2 and radiographs with "subtle bony abnormalities" and "incidental findings"3 more likely to be missed. Our study found facial films more likely to be misread. High clinically significant MIRs also were found for soft tissue neck, mandible, scapula, cervical spine, and foot films. These data suggest that films with high MIRs can be identified and that they may differ from institution to institution. Reasons for differing might include different definitions of discrepancy and/or clinically significant, different volumes of radiograph types, and differences in educational activities. Educational activities can be modified to address films with high MIRs.

Several difficulties are inherent in this type of study. Problems with the definition of discrepancy and clinically significant have been discussed. Because this study was done as part of an ongoing department quality assurance program, borderline cases were included in the clinically significant MIR group to ensure patient follow-up. The degree to which this practice inflated clinically significant MIRs is uncertain. The problem of ED over-reads was not addressed but certainly may represent a large number of misinterpreted radiographs with uncertain clinical importance.

CONCLUSION
Total and clinically significant radiograph misinterpretation rates can be generated by an ongoing departmental quality assurance program. Use of MIRs as a gross evaluation of one aspect of ED function may be appropriate. Use of MIRs as a specific yardstick to rate EDs by outside accrediting agencies is not appropriate without further refinement in the definitions of discrepancy and clinically significant. Use of MIRs to rate individual physicians in the teaching

setting is confounded by staff overview of residents' radiograph readings. Educational activities should be directed toward radiographic studies with higher-than-average reported MIRs.

The authors thank Jayna Ross, executive secretary, for her help with manuscript preparation and Nancy Stratton, ART, for her data collection.

REFERENCES
1. American College of Emergency Physicians: Quality Assessment in the Emergency Department. Des Plaines, Illinois, ACEP, 1984, p 8, 39, 41.
2. Freed HA, Shields NN: Most frequently overlooked radiographically apparent fractures in a teaching hospital emergency department. Ann Emerg Med 1984;13:900-904.
3. Fleisher G, Ludwig S, McSorley M: Interpretation of pediatric X-ray films by emergency department pediatricians. Ann Emerg Med 1983;12:153-158.
4. Overton DT: A quality assurance assessment of radiograph reading accuracy by emergency medicine faculty (abstract). Ann Emerg Med 1987;16:503.
5. Mayhue FE, Rust DD, Alday JC, et al: Accuracy of interpretations of ED radiographs: The effect of confidence levels (abstract). Ann Emerg Med 1988;17:394.
6. Quick G, Podgorny G: An emergency department radiology audit procedure. JACEP 1977;6:247-250.
7. deLacey G, Bother A, Harper J, et al: An assessment of the clinical effects of reporting accident and emergency radiographs. Br J Radiol 1980;53:304-309.
8. Berman L, deLacey G, Twomey E, et al: Reducing errors in the accident department: A simple method using radiographers. Br Med J 1985;290:421-422.
9. Mucci B: The selective reporting of X-ray films from the accident and emergency department. Injury 1983;14:343-344.
10. Joint Commission on Accreditation of Healthcare Organizations: Accreditation Manual for Hospitals, 1988. Chicago, JCAHO, 1989, p 45.
11. O'Leary MR, Smith M, Olmstead WW, et al: Physician assessment of practice patterns in emergency department radiograph interpretation. Ann Emerg Med 1988;17:1019-1023.
