Automated Selection of Clinical Data to Support Radiographic Interpretation

Peter Haug and David Beesley, Department of Medical Informatics, University of Utah, School of Medicine

0195-4210/91/$5.00 © 1992 AMIA, Inc.

Abstract

Modern computerized hospital information systems can store much of the clinical information required for patient care. In some settings the amount of this information is great enough to interfere with its efficient use. Supplying information in support of radiographic interpretation provides an example of this problem. We describe an automated process that abstracts select data from a clinical database and creates a report recounting salient facts designed to assist the radiologist in interpreting the chest x-ray. The reporting system selects data using a combination of expert systems technology and information theory. It produces a brief summary of relevant clinical data. The original version of this process required run-time use of expert systems techniques that were time-prohibitive in the clinical setting. Through the use of pre-compiled tables of finding-specific information contents we have reduced the time requirement to a point where the applications will soon be clinically feasible.

Introduction

The promise of Medical Information Systems is based, in part, on their ability to capture clinical data and to provide this data to health care workers in a timely and effective manner. As growing portions of the medical record come to reside in large clinical databases the need will grow for systems that do not just regurgitate the data. Future health care systems must be able to summarize patient information and produce reports that focus immediately on the information needed for specific tasks.

An example of the need for a restricted view into the clinical data comes from the Radiology Department. A variety of studies have demonstrated weaknesses in the accuracy of reporting. A focus on these studies has been the reliability of interpretations for chest x-rays[1],[2],[3].

Insight into ways to improve the accuracy of radiologists comes from studies regarding the effect of clinical information on their performance. Doubilet formally evaluated the effect of a brief, suggestive history on the recognition of subtle abnormalities[4]. Both resident and staff radiologists read the films with and without historical information. Providing a suggestive history improved recognition of abnormalities from 16% to 72% for residents and from 38% to 84% for the combined resident/staff readings.

Schreiber also investigated the effects of clinical data. He asked radiologists to read 100 films twice[5]. Approximately one-third contained a recognizable abnormality. One reading was made with the real patient history and the other was made without clinical information. A significant reduction in false negative results was noted.

These observations suggest that the availability of appropriate clinical data enhances radiologists' performance. The principal difficulty is to consistently provide this clinical data in an acceptable form. The current generation of hospital information systems contains substantial amounts of clinical data. Enough information is present that, if an effective way is found to bring this information to the radiologist, we may expect the accuracy and clinical usefulness of his interpretation to increase.

However, the radiologist cannot review all of the data. Instead he requires tools that investigate the available data and report to him a manageable subset of information that is relevant to the examination he is interpreting. We are currently testing the hypothesis that, in a clinically oriented hospital information system, tools based on expert systems technology and information theory can be used to select and report relevant clinical data.

The consult system that we have developed uses the probabilities produced by an expert diagnostic system to derive a measure of importance for each of the clinical findings from the patient record. A limited subset of relevant information can then be presented. A major drawback has been the processing time required to produce these probabilities. The original version of the system used an iterative process that required multiple applications of the expert system. Below we describe an experiment designed to determine the potential for reducing this time by developing pre-compiled tables of probability-based relevance scores.

Methods

The HELP system is discussed only briefly here; it is described in detail elsewhere[6]. The salient features of this medical information system are its integrated clinical database and on-line medical expert system. The database contains a large share of the clinical information generated for each patient in the hospital setting. The data are stored in a form immediately accessible to not only the clinicians and nurses caring for the patient, but also to the expert system, which is designed to provide decision support to all health care personnel.

The HELP system contains expert systems tools, which are used to provide routine decision support throughout the hospital. Both rule-based and statistically-based approaches are supported. Medical decision logic is captured from clinicians and is supplemented through techniques that allow the incorporation into the logic of information derived from large clinical databases[7].

The evolving Arden Medical Knowledge Syntax[8] forms the basis for the language in which medical expertise is captured. This syntax provides a language designed specifically for encoding medical logic in a modular, easy-to-read form. These modules are called Medical Logic Modules (MLMs). The version used in the HELP system has been extended to allow representation of diagnostic logic in a form that supports the estimation of disease probabilities using simple Bayesian techniques.

The principal goal of the system described here is to provide a brief summary of the clinical information that is most likely to be useful to the radiologist. This relevant information must be selected from a set of patient data that would often fill a number of typewritten pages if it were all retrieved. To maximize the impact and value for the radiologist, the system must choose a subset of this data that can be reported on a maximum of one-half of a page. Our approach is to borrow an algorithm from information theory and to use it to identify the most "informative" of the clinical data available in the patient records.

Shannon Information Theory is well known in the realm of communications[9]. However, in the clinical setting, only modest amounts of experience with these algorithms exist. Among these efforts is the work of Pitkeathly. He used a measure of information content to characterize the value of radiological and laboratory investigations in patients with suspected inflammatory arthritis[10]. He relied on the reported probability assessments from the clinicians to calculate information content.

Other experience with information theory in radiology includes that of Bigongiari, who described a model for the use of information theory to rate the diagnostic value of radiographic signs in the evaluation of renal mass[11]. The techniques discussed were successful in predicting the most helpful signs to report in evaluating the likelihood of competing diagnoses and in assessing the information gained from each piece of data collected.

Information theory has also been used to evaluate clinical data in other contexts such as cardiology[12]. We feel that this approach has broad relevance both in the evaluation of clinical information already available and in the prediction of that information that would best further the diagnostic process.

The initial prototype for this consultative system has been developed in the realm of chest radiology. Our approach combines an expert diagnostic system with tools that use probabilities provided by this expert system to calculate the information content for data elements stored in the HELP system's clinical database. These tools currently screen up to 211 pieces of clinical information in an attempt to find the data most relevant to the radiologist. The data elements screened are those that contribute to diagnosing 30 pulmonary diseases. These diseases are represented by 30 modules of diagnostic logic written in the Arden Syntax (MLMs). The goal of the system is to choose a maximum of 25 of these data elements, which will then be reported to the radiologist at the time he evaluates the chest films. Figure 1 illustrates the components and flow of this system.

Figure 1

Probabilistic Diagnostic MLMs

As the system functions, each piece of information is extracted from the patient's computerized record and is placed in a file. This file serves as a blackboard where the consultative process can post individual data elements. The expert system calculates a probability for each disease with and without this element of information and then publishes the disease probabilities that result from this information in the blackboard file. The first calculation, made without any clinical data, provides the prior probability of that disease. The second calculated value reflects the increase or decrease in disease probability associated with the data element evaluated. As an example, the presence of a cough or an elevated white blood count will raise the probability of pneumonia while the absence of a cough or a normal white count will lower it. The likelihoods derived for all of the disease MLMs that use these findings form the substrate upon which the system works to determine the relative diagnostic value of the cough and the white count.
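As a rough illustration of the two probabilities posted for a single finding, the sketch below applies Bayes' rule to one disease. It is not the HELP system's MLM logic: the function name `posterior` and the prior and likelihood numbers are hypothetical, chosen only to show a positive finding raising, and a negative finding lowering, the disease probability.

```python
def posterior(prior, p_finding_given_disease, p_finding_given_no_disease, present=True):
    """Return P(D | finding state) from P(D) using Bayes' rule.

    All three probabilities are illustrative inputs, not values from the paper.
    """
    if not present:
        # Switch to the probabilities of the finding being absent.
        p_finding_given_disease = 1.0 - p_finding_given_disease
        p_finding_given_no_disease = 1.0 - p_finding_given_no_disease
    numerator = p_finding_given_disease * prior
    denominator = numerator + p_finding_given_no_disease * (1.0 - prior)
    return numerator / denominator


# Hypothetical numbers: prior of pneumonia 0.10; cough reported by 80% of
# patients with pneumonia and by 30% of patients without it.
prior = 0.10
with_cough = posterior(prior, 0.80, 0.30, present=True)
without_cough = posterior(prior, 0.80, 0.30, present=False)
print(f"prior={prior:.3f}  cough={with_cough:.3f}  no cough={without_cough:.3f}")
```

With these made-up inputs the probability rises above the prior when the cough is present and falls below it when the cough is absent, which is the pair of values the information-content tools consume.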

Each time a finding is evaluated in this way, two probabilities are produced for each disease and are passed to tools that determine the amount of information available from this data element. If we label the diseases D1, D2,..., Dn, then these probabilities are P(Di), the prior probability of disease Di, and P(Di|F), the new probability in patients who have the finding F. A measure of the information contributed to ruling in, or out, the individual disease is calculated from these probabilities using a method based on the computation of information content described by Pitkeathly[10]. According to this approach, the information associated with the single disease Di is deduced by calculating a quantity called "the uncertainty before the addition of information". This quantity is analogous to the "entropy" originally described by Shannon. The uncertainty is labeled Ha. After recalculating the probability of the disease based on the element of clinical data placed in the blackboard, the uncertainty is recalculated and labeled Hb. This is "the uncertainty after the addition of information". The information content is based on the change in uncertainty from Ha to Hb.

These quantities are calculated using the equations below.

$$H_a = -P(D_i)\,\log_2 P(D_i) - [1 - P(D_i)]\,\log_2 [1 - P(D_i)] \tag{1}$$

$$H_b = -P(D_i \mid F)\,\log_2 P(D_i \mid F) - [1 - P(D_i \mid F)]\,\log_2 [1 - P(D_i \mid F)] \tag{2}$$

A maximum uncertainty would occur if the likelihoods of all of the diseases were equal. Because of the shape of the information function, if the probability of a disease increases by a large enough amount, the uncertainty will increase to its maximum and then begin to decline. In this situation the equation for the information associated with the disease Di should be:

$$IC(D_i) = (H_{max} - H_a) + (H_{max} - H_b) \tag{3}$$

In the case where the change in probability does not take the uncertainty through its maximum, the information associated with the disease Di is:

$$IC(D_i) = |H_a - H_b| \tag{4}$$

Both of these equations are designed to return positive values. This departs from a strict interpretation of Shannon's theory and reflects our belief that it is the absolute change in uncertainty that makes a finding relevant.

The values represented by equations 3 and 4 represent an estimate of the information provided by the finding F in the context of the disease Di. The sum of the IC(Di)'s across all of the diseases for which the examination might give diagnostic aid is the information content associated with the piece of clinical data that is being evaluated.

The system proceeds to search the patient's record for each of the 211 data elements. For each one that it finds, it creates a temporary record in its blackboard and generates this measure of information content. When this series of calculations is complete it examines the values generated and produces a report for the radiologist. This report is currently defined to consist of the 25 findings with the greatest information content. They are organized in five parts: The most informative positive history findings are printed first. Next are the most informative negative history findings. Third are salient findings from the physical examination. The most important laboratory findings appear fourth, and fifth we print the most important previous x-ray findings. This report is designed to be displayed on a terminal or printer as a part of a radiologist's worksheet that is generated by the radiology subsystem within the HELP system. Figure 2 is an abbreviated example of its appearance.

Figure 2

CLINICAL DATA

HISTORY
Patient Complains of:
  HISTORY OF ASTHMA
  ASTHMA ATTACK
  DYSPNEA
  DAILY COUGH - MORE THAN 2 MONTHS
  WHEEZING CAUSED BY INFECTION
  PURULENT SPUTUM
  PAROXYSMAL NOCTURNAL DYSPNEA
  ACUTE DYSPNEA
Patient Denies:
  FEVER
  CHILLS
  CHEST INJURY

PHYSICAL EXAM
  RESPIRATORY RATE     26
  ORAL TEMPERATURE     37.0

LAB DATA
  POSITIVE SPUTUM STAIN
  WHITE BLOOD COUNT    9700
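The information-content calculation of equations 1 to 4, followed by the selection of the highest-scoring findings for the report, can be sketched as follows. This is a minimal illustration rather than the authors' implementation. It assumes that Hmax is 1 bit (the maximum of the two-outcome uncertainty, reached at a probability of 0.5) and that the uncertainty "passes through its maximum" when the prior and the new probability lie on opposite sides of 0.5. The function names and the example probabilities are hypothetical.

```python
import math

H_MAX = 1.0  # assumed: maximum two-outcome uncertainty, reached at P = 0.5


def uncertainty(p):
    """Two-outcome uncertainty in bits (the form of equations 1 and 2)."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # limit of p * log2(p) as p approaches 0 or 1
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def information_content(prior, post):
    """Information one finding contributes for one disease (equations 3 and 4)."""
    h_a, h_b = uncertainty(prior), uncertainty(post)
    # Assumed test for "the change takes the uncertainty through its maximum".
    crosses_maximum = (prior - 0.5) * (post - 0.5) < 0
    if crosses_maximum:
        return (H_MAX - h_a) + (H_MAX - h_b)
    return abs(h_a - h_b)


def finding_score(probability_pairs):
    """Sum IC(Di) over all diseases; each pair is (P(Di), P(Di|F))."""
    return sum(information_content(p, q) for p, q in probability_pairs)


def top_findings(scores, limit=25):
    """Return up to `limit` finding names, greatest information content first."""
    return sorted(scores, key=scores.get, reverse=True)[:limit]


# Hypothetical probabilities for two findings.
scores = {
    "cough": finding_score([(0.10, 0.23), (0.05, 0.05)]),
    "white count": finding_score([(0.10, 0.60)]),
}
print(top_findings(scores, limit=1))
```

A finding that leaves a disease probability unchanged contributes nothing to the sum, so only findings that move at least one disease probability can reach the report.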

A series of prototypes of the process described above has been in testing since August of 1990. The major problem encountered has been that of timeliness. In our radiology department, a variety of paperwork is generated after each procedure is ordered. To allow for efficient functioning of the department, all reports associated with an examination should be available within 5 minutes of an order for a procedure. The current prototype can take up to 3 hours to complete its processing and generate a report. This reflects the fact that the process described requires the MLM for each disease to be processed in the presence of each finding. To analyze 100 findings the system would process 3000 MLMs. When the system is busy, each MLM may require more than 1 second to generate a disease probability. The result is unacceptably slow performance.
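The MLM count quoted above follows from running each of the 30 disease modules once per finding. As a worked check, and treating 1 second per MLM as a lower bound under load rather than a measured figure:

```latex
100 \text{ findings} \times 30 \text{ MLMs per finding} = 3000 \text{ MLM runs},
\qquad
3000 \text{ runs} \times 1\ \text{s per run} = 3000\ \text{s} = 50\ \text{min}
```

Even this lower bound is ten times the 5-minute target for report availability.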

The original blackboard buffer was a disk file that was written to and read from by the clinical summary process and the frames. Below we present the results of two experiments that explore alternate approaches to reducing the amount of time necessary to produce a clinical summary.

The first alteration in the model that we tested was to move the blackboard into computer memory. We developed techniques that allowed the patient findings to be placed in volatile memory during report generation. The frames access them in memory and write their results back to this buffer. The information content for each finding can then be calculated from the probabilities left in this more efficient blackboard.

A second approach has been to replace the full version of the program with one that can use pre-calculated values for the information contents it requires. To do this we have implemented a process that compiles tables of information content for all relevant values for each of the data elements used by the system. Thus, values are placed in this table for the two possible values of cough, "Yes" and "No", and for up to eight different ranges of values for a continuous data element such as white count. These tables allow the reporting process to avoid the compute-intensive task of processing a group of diagnostic modules for each finding. Instead it can access the table to find the pre-calculated information content for any value of each data element.

Results

The three techniques described above were tested for efficiency by timing their behavior on a test patient with 121 of the 211 findings for which the system could assess information content. For data elements that the patient did not have, all three approaches functioned similarly. They looked in the clinical database for the finding and, upon discovering it missing, simply went on to search for another potentially relevant finding. The principal difference is that both of the versions that use the blackboard model in real time begin processing by generating a priori probabilities for the 30 diseases in the model. In our testing, this took approximately 35 seconds for the disk-based blackboard and 23 seconds for the blackboard in volatile memory. In comparison, the pre-compiled version required approximately 4.4 seconds of startup time. Subsequently, all versions spent similar amounts of time for missing data elements. For data absent from the patient record the algorithms averaged 0.75 seconds per finding.

For data elements present in the patient record, the algorithms each generated a score reflecting information content as described above. This is where most of the processing time occurred and where the effects of the different approaches become apparent. To test their efficiency, each version of the clinical summary program was run for subsets of this data and the time required was recorded and analyzed.

Table I contains the results of timing experiments employing the three techniques discussed above. In each case 20-50 findings from the test patient's data were processed. As expected, the total time required was a linear function of the number of findings analyzed. This reflects the fact that, within each algorithm, the processing is essentially identical for each finding tested.
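For reference, the table lookup that the pre-compiled technique relies on might look like the sketch below. The table layout, the names `CATEGORICAL`, `CONTINUOUS` and `lookup_information_content`, the range boundaries and every score are assumptions made for illustration. The paper does not describe how its tables are stored, and the example uses four ranges for white count although up to eight are allowed.

```python
from bisect import bisect_right

# Hypothetical pre-compiled tables. Categorical elements map each value to a
# score; continuous elements keep ascending range boundaries plus one score
# per range. All numbers are invented for illustration.
CATEGORICAL = {"cough": {"Yes": 0.31, "No": 0.12}}
CONTINUOUS = {
    "white count": {
        "bounds": [4000, 11000, 20000],      # 3 boundaries define 4 ranges
        "scores": [0.20, 0.05, 0.40, 0.75],  # one score per range
    }
}


def lookup_information_content(element, value):
    """Return the pre-calculated score for one finding, or None if unknown."""
    if element in CATEGORICAL:
        return CATEGORICAL[element].get(value)
    if element in CONTINUOUS:
        entry = CONTINUOUS[element]
        # bisect_right returns the index of the range that contains `value`.
        return entry["scores"][bisect_right(entry["bounds"], value)]
    return None


print(lookup_information_content("cough", "Yes"))
print(lookup_information_content("white count", 9700))
```

Each lookup replaces a full pass over the diagnostic modules with a dictionary access and, for continuous elements, one binary search.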

Table I

Technique                     Time per Absent Finding   Time per Present Finding   Time per MLM
Blackboard-File on Disk       0.75 seconds              49.3 seconds               1.64 seconds
Blackboard-File in Memory     0.75 seconds              29.1 seconds               0.97 seconds
Pre-compiled-No Blackboard    0.75 seconds              1.34 seconds               No MLMs Run
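The per-finding times in Table I can be combined with the startup times reported in the Results to estimate the total time for the test patient, who had 121 of the 211 screened findings. The sketch below assumes that the total is simply startup time plus the time for present findings plus the time for absent findings. That additive model is an inference from the reported figures, not a formula stated in the text.

```python
FINDINGS_SCREENED = 211
FINDINGS_PRESENT = 121
SECONDS_PER_ABSENT_FINDING = 0.75

# (startup seconds, seconds per present finding), as reported in the Results.
TECHNIQUES = {
    "Blackboard file on disk": (35.0, 49.3),
    "Blackboard file in memory": (23.0, 29.1),
    "Pre-compiled, no blackboard": (4.4, 1.34),
}


def total_minutes(startup, per_present):
    """Estimated report time in minutes under the assumed additive model."""
    absent = FINDINGS_SCREENED - FINDINGS_PRESENT
    seconds = (
        startup
        + FINDINGS_PRESENT * per_present
        + absent * SECONDS_PER_ABSENT_FINDING
    )
    return seconds / 60.0


for name, (startup, per_present) in TECHNIQUES.items():
    print(f"{name}: {total_minutes(startup, per_present):.2f} minutes")
```

Under this model the three estimates come to about 101.1, 60.2 and 3.9 minutes, which agrees with the totals quoted in the Discussion.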


Discussion

The process described above produces reports that contain focused clinical data relevant to the types of diseases for which a chest x-ray might be ordered. As this tool approaches clinical implementation two questions remain. The first is whether we can integrate these processes into the work flow of a busy radiology department. No matter what the value of a clinical report in improving the accuracy of the radiologist, the information is useless if inefficient processing prevents the delivery of this data.

The combination of information theory and the blackboard technique provides a viable approach to the problem of selecting relevant clinical information to report to the radiologist. Unfortunately this technique is expensive in terms of processing time. Fortunately, the benefits of this approach can be retained while reducing processing time. This is done by compiling the calculated scores into tables of result-specific information contents.

The results in Table I demonstrate the advantages of using a compiled form of the consult program. This approach has cut processing time dramatically. A report on the test patient with 121 findings would have taken 101.1 minutes using the original version of the system, 60.2 minutes using the RAM-based version, and 3.90 minutes using the version with the pre-compiled information contents. This amount of time is acceptable for report generation in the clinical setting.

A second and more fundamental question is whether these reports will contribute to the accuracy of the radiologists as they examine chest x-rays. In the work of Rhea, Doubilet, and Schreiber discussed above, the clinical data used was selected by clinicians, in many cases with prior knowledge of the radiologic findings contained in the films. Whether a computerized system is capable of choosing data well enough to have an impact on accuracy is unknown.

We are exploring this question with a simple experiment. A group of five physicians will read a collection of chest x-rays with and without the computer-supplied clinical information. We anticipate that an analysis of their reports will help to answer this question: can the automated selection and reporting of clinical data effectively support the radiologist as he interprets radiographic examinations?

This publication was supported in part by grant R01 LM04932 from the National Library of Medicine.

References

1. Koran LM. The reliability of clinical methods, data and judgements (parts I and II). NEJM (1975) 293:642-646 and 695-701.
2. Herman PG, et al. Disagreements in chest roentgen interpretation. Chest (1975) 68:278-282.
3. Rhea JT, Potsaid MS, DeLuca SA. Errors of interpretation as elicited by a quality audit of an emergency radiology facility. Radiol. (1979) 132:277-280.
4. Doubilet P, Herman PG. Interpretation of radiographs: effects of a clinical history. AJR (1981) 137:1055-1058.
5. Schreiber MH. The clinical history as a factor in roentgenogram interpretation. JAMA (1963) 185:137-139.
6. Pryor TA, Gardner RM, Clayton PD, Warner HR. The HELP system. J Med Syst (1983) 7:87-102.
7. Haug PJ, Hoak S. Veristat: A Support Tool for Knowledge Development. Fourteenth Annual Symposium on Computer Applications in Medical Care, pp 650-654, 1989.
8. Hripcsak G, Clayton PD, Pryor TA, Haug PJ, Wigertz OB, Van der Lei J. The Arden Syntax for Medical Logic Modules. Fourteenth Annual Symposium on Computer Applications in Medical Care, pp 200-204, 1989.
9. Shannon CE, Weaver W. The mathematical theory of communication. Urbana IL: Univ. of Illinois Press, Chicago (1949).
10. Pitkeathly DA, Evans AL, James WB. The use of information theory in evaluating the contribution of radiological and laboratory investigations to diagnosis and management. Clin Radiol (1979) 30:643-647.
11. Bigongiari LR, Preston DF, Cook L, Dwyer SJ, Fritz S, Fryback DG, Thornbury JR. Uncertainty/information as measures of various urographic parameters: an information theory model of diagnosis of renal masses. Invest Rad (1981) 16:77-81.
12. Diamond GA, Hirsh M, Forrester JS, Staniloff HM, Vas R, Halpern SW, Swan HJC. Application of information theory to clinical diagnostic testing. Circulation (Nov 1981) 63:915-921.
