Lung (1990) Suppl: 1193-1200

An Expert System for Synoptic Interpretation of Lung Function Tests

D. Heise, P. Kroker, and A. Mailänder

Medizinische Universitätsklinik, Abt. Pneumonologie, Homburg/Saar, Federal Republic of Germany

Abstract. We simulate the interpretation process by testing preformed working hypotheses. A clinical syndrome, "bronchial obstruction," is described by a set of suitable parameters (FEV1, MMEF, Raw, etc.). For a given patient, this set forms a normalized vector. It has to be compared with equivalent data derived from patients who fulfilled the criteria for the clinical syndrome in question. If the patient's vector has a direction similar to that of the vector of the collective, the working hypothesis is accepted. The length of the vector is then used to quantify the severity of the functional disturbances in verbal terms ("slight," "moderate," "severe"). The limits used for severity grading and the typical parameter pattern for the given syndrome are adapted to the user's criteria by a built-in learning capability. Conversely, the assembled data may be used for the training of newcomers. The use of vector algorithms gives our program high flexibility with respect to all methods used in lung function testing.

Key words: Expert system, Interpretation, Lung function tests.

Introduction

We developed an expert system for the synoptic interpretation of lung function test results. The program does not use rigid decision trees; instead, it works like a physician who tests a series of working hypotheses using techniques for estimation and for the recognition of typical patterns. To obtain reliable results in routine assessment, we started by defining a set of working hypotheses such as "bronchial obstruction," "restriction," "arterial hypoxaemia," and so on. This first step cannot be programmed; it requires the experience of an expert. The following steps are programmable: quantitative description of lung function test results, synoptic evaluation of the data with respect to a severity scale, testing of a given working hypothesis, and finally the interpretation as a verbalization of numeric results. Our program has certain learning capabilities; it may adapt itself to the habits of the user. This is accomplished by variable limits for the staging of the severity of functional disturbances and by a statistical approach to the recognition of typical patterns.

Offprint requests to: Dr. D. Heise, Medizinische Universitätsklinik, Abteilung Pneumonologie, D-6650 Homburg/Saar, FRG.

Mathematical Approach

Since the program handles only a single working hypothesis at any one time, we may restrict ourselves to the definition of a typical example, for instance "bronchial obstruction." Bronchial obstruction can be understood as a syndrome because there must be evidence for a combination of several symptoms, such as dyspnea or wheezing, in order to verify this working hypothesis. In lung function testing, each conspicuous measuring result may likewise be interpreted as a symptom. The information given by the different parameters must be classified with respect to the working hypothesis. An increase of airways resistance and a decrease of conductance and forced expiratory volume are strong arguments for the working hypothesis bronchial obstruction, whereas an increase of the forced expiratory volume relative to vital capacity and a decrease of total lung capacity tend to falsify this hypothesis. Flow parameters and vital capacity are reduced in nearly all clinical conditions. They are called neutral because they do not force a decision, but they may increase its certainty.

Working hypothesis (= syndrome): Bronchial obstruction

Classification of selected parameters:
pro: Raw, sGaw, FEV1, FEV1/VC
neutral: PEF, MMEF, RV, VC
contra: FEV1/VC (+!), TLC (-!)

A serious problem is that all these parameters show different trends which seem nearly incomparable. Comparing variations of airways resistance and forced expiratory volume directly is not meaningful because their trends have opposite signs. This problem can be solved by a special normalization. Usually we compare the measured values with the predicted values only, and the predicted values are the reference for health. If we want comparable results for all parameters, we need a second cornerstone, namely the patient under the worst possible conditions in clinical practice. Therefore we define the functional disturbance FDi as follows:

$$FD_i = \frac{M - R}{E - R} \times 100\ (\%)$$

i: parameter (e.g., Raw, FEV1)
M: measured value (for parameter i)
R: reference value (i.e., predicted value)
E: extreme value (clinically worst condition)
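The normalization above can be sketched in a few lines of Python. This is only an illustration of the formula: the function name and all numeric values are hypothetical and are not taken from the paper. It shows that a reduced FEV1 and an increased Raw both map onto a positive FDi, although their raw trends have opposite signs.

```python
def functional_disturbance(measured, reference, extreme):
    """Return FD_i in percent: 0 at the predicted value, 100 at the extreme value."""
    if extreme == reference:
        raise ValueError("extreme value must differ from the reference value")
    return (measured - reference) / (extreme - reference) * 100.0


# Hypothetical numbers for illustration only (not from the paper).
# FEV1 falls in obstruction: predicted 4.0, worst case 1.0, measured 2.0.
fd_fev1 = functional_disturbance(measured=2.0, reference=4.0, extreme=1.0)
# Raw rises in obstruction: predicted 0.2, worst case 1.0, measured 0.6.
fd_raw = functional_disturbance(measured=0.6, reference=0.2, extreme=1.0)

print(round(fd_fev1, 1), round(fd_raw, 1))
```

Because the extreme value sets both the sign and the scale of the denominator, no separate handling is needed for parameters that rise and parameters that fall with disease.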


This formula yields a quantitative description of the functional disturbance FDi because it accounts for the measured value, the predicted value, and an extreme value. An FDi of zero denotes that the patient meets the predicted value, whereas an FDi of 100% means that the patient is severely ill. Since the extreme value under the worst possible conditions can simply be estimated, for instance as a percentage of the predicted value, our program can calculate a complete set of FDi values for all measured parameters. This set of FDi values for the given syndrome definition may be interpreted as a vector. This vector characterizes the actual patient; we therefore call it the patient vector. The length L of the vector depends on the severity of the impairment, while its direction may be typical for the syndrome in question:

$$L = \sqrt{\sum_{i=1}^{N} FD_i^2} \le \sqrt{N} \times 100\%$$
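As a small numerical check of the length formula, the following sketch computes L for a patient vector and the upper bound for N components. The function names and values are illustrative assumptions; the bound holds only as long as every FDi lies between 0 and 100%.

```python
import math


def vector_length(fd_values):
    """Euclidean length L of a patient vector of FD_i values (in percent)."""
    return math.sqrt(sum(fd * fd for fd in fd_values))


def max_vector_length(n_components):
    """Upper bound sqrt(N) * 100% reached when every FD_i equals 100%."""
    return math.sqrt(n_components) * 100.0


# Hypothetical two-component patient vector.
length = vector_length([30.0, 40.0])
bound = max_vector_length(2)
print(length, round(bound, 1))
```

The bound grows with the number of components, which is why the length itself is not a convenient severity scale.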

The maximal length of the vector depends on the number N of components, whereas a suitable severity scale should be independent of this number. Consequently, we define the mean severity FDm for k selected components (i = 1 ... k):

$$FD_m = \sqrt{\frac{1}{k} \sum_{i=1}^{k} FD_i^2}$$

The definition of mean severity FDm yields values between 0 and 100%, equivalent to the definition of functional disturbance for a single parameter. Additionally, it takes into account that we classified our parameters with respect to their conclusiveness. We distinguish the mean severity for pro-parameters, FDpro, and for contra-parameters, FDcontra. Both values are needed for testing the working hypothesis. If the mean severity for the pro-parameters is lower than a minimal threshold value Min, we are sure that the functional disturbances have no clinical relevance:

$$FD_{pro} < Min$$

If the mean severity for the contra-parameters is greater than that for the pro-parameters, the data are inconsistent:

$$FD_{pro} < FD_{contra}$$

In both cases the working hypothesis should be rejected. These estimations are based on scalar values, which are derived not from single parameters but from the complete set as classified in the syndrome definition.

All vector components are used together to recognize a nontypical pattern. First, the patient vector is normalized to a length of 100 units by multiplying its components by a suitable normalization factor. With this step we disregard the vector length as a measure of severity and restrict ourselves to the direction of the vector. The normalized vector is then compared with a typical syndrome vector, which represents the statistical data of all patients for whom this working hypothesis has already been verified. The comparison is based on the difference between the two vectors; if the length of the difference vector is greater than the length of a vector that represents the acceptable tolerance of the syndrome vector, the patient vector differs from the syndrome vector and therefore does not match the typical pattern. Under this condition, the program sends a warning message to the user and asks for a decision.

[Figure 1: two panels tabulating the vector components per parameter together with a bar graph. First panel: hypothesis "bronchial obstruction" accepted, FDpro > FDcontra; severity scale 0 - no - 12 - slight - 40 - moderate - 73 - severe - 100%; verbalization: severe. Second panel: hypothesis "restriction" falsified, FDpro = 37% << FDcontra = 77%; verbalization: no restriction. The individual component values are not recoverable from the source.]

Fig. 1. Calculation examples for the syndromes "bronchial obstruction" and "restriction" (see text).

If, on the other hand, the working hypothesis cannot be falsified, the program outputs a verbalization of its calculation results. For this purpose, the mean severity of the pro-parameters is compared stepwise with a series of limits that delimit intervals corresponding to text elements. Neither the limits, nor their number, nor the text elements are fixed elements of the program; they are chosen by the user during the definition of the syndrome.

Decision                Corresponding verbalization, if true
FDpro < Limit[1]        No bronchial obstruction
FDpro < Limit[2]        Slight bronchial obstruction
FDpro < Limit[3]        Moderate bronchial obstruction
FDpro >= Limit[3]       Severe bronchial obstruction
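The scalar tests and the stepwise verbalization can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the return format, the rejection message, and the FDi values are hypothetical, and the paper does not say how negative FDi values enter the mean (here they are simply squared). The limits 12, 40, and 73% are the learned values the paper reports for bronchial obstruction.

```python
import math
from bisect import bisect_right


def mean_severity(fd_values):
    """Root mean square of the FD_i values in percent; 0.0 for an empty set."""
    if not fd_values:
        return 0.0
    return math.sqrt(sum(fd * fd for fd in fd_values) / len(fd_values))


def interpret(fd_pro_values, fd_contra_values, limits, texts):
    """Return (accepted, text) for one working hypothesis.

    limits is an ascending list [Limit[1], Limit[2], ...];
    texts holds one text element per interval, so one more entry than limits.
    """
    fd_pro = mean_severity(fd_pro_values)
    fd_contra = mean_severity(fd_contra_values)
    if fd_pro < limits[0]:
        # FDpro < Min (= Limit[1]): no clinical relevance.
        return False, texts[0]
    if fd_pro < fd_contra:
        # Contra-parameters outweigh pro-parameters: inconsistent data.
        return False, "inconsistent data"
    # The number of limits that FDpro has reached selects the text element.
    return True, texts[bisect_right(limits, fd_pro)]


limits = [12.0, 40.0, 73.0]
texts = [
    "No bronchial obstruction",
    "Slight bronchial obstruction",
    "Moderate bronchial obstruction",
    "Severe bronchial obstruction",
]

# Hypothetical FD_i values for illustration.
severe = interpret([80.0, 90.0, 85.0], [0.0], limits, texts)
rejected = interpret([30.0], [60.0], limits, texts)
print(severe, rejected)
```

Keeping the limits and text elements as plain lists mirrors the paper's point that neither their values nor their number is fixed; a syndrome with a fourth "questionable" interval only needs one more limit and one more text.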

The above-mentioned minimal threshold value Min for FDpro is identical to Limit[1].

The described method can be demonstrated by two examples (Fig. 1). The first syndrome is "bronchial obstruction." The definition lists three pro-parameters, four neutral parameters, and one contra-parameter. Thus the patient vector consists of 8 FDi components. The normalized patient vector (length about 100 units) and the statistically derived syndrome vector are represented graphically. A nearly symmetrical graph demonstrates that both vectors have similar directions. The length of the difference vector is much less than the length of the tolerance vector. The severity for the pro-parameters is high, and the contra-parameter (reduced total lung capacity) is negligible. The working hypothesis cannot be falsified, and the appropriate interval for FDpro yields severe bronchial obstruction.

The same measured data applied to the working hypothesis "restriction" result in an irregular graph. The working hypothesis is falsified because the contra-arguments outweigh the pro-arguments. Since restriction is sometimes difficult to verify, we introduced a fourth interval for verbalization, "questionable restriction," indicated by a question mark.

As already mentioned, the limit values used for verbalization are not constants but variables. Thus the program has a certain learning capability. If the user does not accept the proposed verbalization and chooses an alternative, the program modifies the limits between the intervals corresponding to the verbalization text. The syndrome vector SV ("typical pattern") is calculated as the mean value of all patient vectors for which the syndrome has been verified. Additionally, the tolerance vector TV is derived as a standard deviation of the mean. In the learning mode, the program adapts itself to the habits of the user. Certainly, the program will tend to imitate decisions, even if they are wrong. But we should not forget that a student learning in good faith from a chaotic teacher will do the same because his knowledge base is not stable.
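The statistical part of the learning step can be sketched as a componentwise mean and spread over the normalized vectors of accepted patients. The sketch rests on assumptions: the paper says only that TV is "a standard deviation of the mean," which is read here as the sample standard deviation per component (a standard error would additionally divide by the square root of the number of patients). The function names and the two example vectors are hypothetical, and the sketch recomputes everything from scratch rather than updating after each accepted patient.

```python
import math
import statistics


def normalize(vector, target_length=100.0):
    """Scale a patient vector to the target length, keeping its direction."""
    length = math.sqrt(sum(c * c for c in vector))
    if length == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [c * target_length / length for c in vector]


def learn_syndrome(accepted_patient_vectors):
    """Return (SV, TV) from at least two accepted patient vectors.

    SV: componentwise mean of the normalized patient vectors.
    TV: componentwise sample standard deviation (an assumption, see lead-in).
    """
    normalized = [normalize(v) for v in accepted_patient_vectors]
    columns = list(zip(*normalized))
    sv = [statistics.fmean(col) for col in columns]
    tv = [statistics.stdev(col) for col in columns]
    return sv, tv


# Two hypothetical accepted patients with two components each.
sv, tv = learn_syndrome([[3.0, 4.0], [4.0, 3.0]])
print([round(c, 1) for c in sv], [round(c, 1) for c in tv])
```

Because only accepted patients enter the statistics, a user who accepts a borderline case widens TV for the affected components, which is the behaviour the paper describes as learning.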

Results

We have gained some practical experience with this expert system. First, we tested the flexibility of syndrome definitions. We could simulate a simple decision tree, as given for example in the well-known program of Ellis et al. [1], as well as a complete set of about 40 syndromes covering all measuring methods, such as spirometry, body plethysmography, blood gas analysis, and CO transfer. A minimal modification of the definition of functional disturbance allows for the interpretation of functional changes under therapy, preferably in bronchial dilatation and bronchial provocation tests. We obtained quantitative results within a few days because in our program neither the limits, nor the text elements for a syndrome, nor the text elements for the verbalization of the severity of the functional impairment are fixed. The development time for a comparable program with a fixed structure can be estimated at months to more than a year.

We then used the learning capability to obtain a quantitative model of the experience of four physicians with different training [2]. The lung function data of the same group of 110 patients were interpreted, and subsequently the knowledge base of the program was analyzed. The results were comparable only for well-established syndrome definitions such as bronchial obstruction or respiratory insufficiency. The results for syndromes such as restriction or suspicion of emphysema clearly depended on the "schools" of pneumology.

The degree of functional disturbance can be used to estimate the reduction in earning capacity (in Germany, "Minderung der Erwerbsfähigkeit," MdE) in coal miners with silicosis, if we choose a set of parameters that has been recommended by a national committee of silicosis experts [3].

The rest of this paper is concerned with the learning capacity of our program in order to demonstrate some aspects of its "artificial intelligence." We tested the working hypothesis "bronchial obstruction" in a group of 72 patients (Fig. 2). The four intervals for "no obstruction," "slight obstruction," "moderate obstruction," and "severe obstruction" had first been chosen with the same span. In other words, the limits for verbalization were preset equidistantly at 25, 50, and 75%. By estimating the severity of bronchial obstruction from the mean functional disturbance FDpro, we found bronchial obstruction in 43 patients, whereas in 29 patients the working hypothesis had to be rejected. During the estimation process for the whole group, the limits between the intervals were modified down to 12, 40, and 73%, respectively. It must be mentioned that this learning effect depends slightly on the starting values. If we started with 100% for the upper limit, we reached 75%, but starting with 50% yielded 69% for the upper limit when the data of all patients had been interpreted. It is obvious that the program can learn only if there are appropriate patients who force a modification of the actual limits in a certain direction.

[Figure 2: severity scale from 0 to 100% with the learned limits at 12, 40, and 73% and the number of patients per interval: slight (13), moderate (14), severe (16).]

Fig. 2. Learning effect for verbalization limits: The working hypothesis "bronchial obstruction" has been rejected for 29 of 72 patients. The rest of the patients (43) have been classified according to the severity of bronchial obstruction. All limit values are slightly reduced as compared to the starting values.

The second learning effect concerns pattern recognition. Figure 3 lists four of the eight parameters which were used in the definition of the syndrome "bronchial obstruction": The forced expiratory volume in the first second, FEV1, and the specific airways conductance, sGaw, are pro-parameters; the mean mid-expiratory flow, MMEF, is a neutral parameter; and the total lung capacity, TLC, is the only contra-parameter. The syndrome vector SV is built componentwise as the mean value of the normalized patient vectors. The tolerance vector TV represents the appropriate standard deviation of the mean. Both vectors represent only data of patients for whom the working hypothesis "bronchial obstruction" has been accepted. The mean value for FEV1 is stable within the first ten patients, but its standard deviation is increased at patients nos. 46 and 57. The mean value for sGaw and its standard deviation vary considerably. The MMEF yields the highest mean value and a relatively small standard deviation, which is increased by patient no. 14 but reduced again by the following five patients. Even so, MMEF cannot be used as a pro-parameter since it shows similar values, for instance, with respect to the working hypothesis "restriction." For TLC, the standard deviation is nearly twice the mean value because this parameter works as a contra-argument.

[Figure 3: for each of the selected parameters, the SV and TV components plotted against patient number (0 to 70); the plotted values are not recoverable from the source.]

Fig. 3. Learning effect for selected components of the syndrome vector SV (mean, rectangles) and the tolerance vector TV (standard deviation of the mean, tolerance markers). The final results are listed at right.

[Figure 4: lengths of the difference vector and of the tolerance vector plotted against patient number; 43 patients accepted, 29 rejected.]

Fig. 4. Length of the difference vector (normalized patient vector nPV minus syndrome vector SV) and length of tolerance vector TV for all 72 patients. SV and TV are recalculated only if a patient has been accepted.


It is somewhat difficult for a human being to keep track of up to 15 components simultaneously. In fact, it is usually sufficient to calculate the length of the tolerance vector and compare it with the length of the difference between the patient vector and the syndrome vector. Figure 4 demonstrates that the tolerance vector reached its final length after fewer than ten accepted patients; the subsequent variations are negligible. Usually, the decision is simple since the difference vectors are either short or very long compared with the length of the tolerance vector.
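The length comparison described here reduces the pattern check to two scalars. The following sketch illustrates it under assumptions: the function names are hypothetical, the vectors are invented two-component examples, and treating equal lengths as a match is a choice made here, since the paper only states that a longer difference vector means a mismatch.

```python
import math


def length(vector):
    """Euclidean length of a vector."""
    return math.sqrt(sum(c * c for c in vector))


def matches_pattern(normalized_patient, syndrome_vector, tolerance_vector):
    """True if the difference vector is no longer than the tolerance vector."""
    difference = [p - s for p, s in zip(normalized_patient, syndrome_vector)]
    return length(difference) <= length(tolerance_vector)


# Hypothetical syndrome and tolerance vectors with two components.
sv = [70.0, 70.0]
tv = [15.0, 15.0]

typical = matches_pattern([60.0, 80.0], sv, tv)    # difference length about 14.1
atypical = matches_pattern([100.0, 0.0], sv, tv)   # difference length about 76.2
print(typical, atypical)
```

In the paper's program a mismatch does not reject the hypothesis outright; it triggers a warning so that the user can decide.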

Discussion

We may conclude that our expert system is a very flexible tool for the reliable and fast interpretation of lung function test results. All its components, such as text elements or parameter constellations, are variables which may be modified by the user. The use of parameter sets instead of single parameters allows for a synoptic interpretation. The criteria for severity staging and pattern recognition may be automatically adapted to the habits of the user. The learning process can be reversed; in a special mode, the program can be used as a trainer for the inexperienced user. The knowledge of the program can be used for the further development of clinically relevant syndrome definitions in order to obtain a widely acceptable basis for standardizing the interpretation of lung function test results. Frequently appearing parameter constellations define the syndrome vector and thus are handled automatically by the program, whereas borderline cases are recognized as such and obviously have a great influence on this knowledge if the user decides "ex cathedra" that the syndrome definition is fulfilled. The sudden increase of the standard deviation demonstrates that such patients enlarge the "knowledge" of our expert system. The program then uses altered mean values and, preferably, a higher tolerance; it is not as narrow-minded as before! That is our view of artificial intelligence: An expert system works as an intelligent tool for boring, routine work, and it should additionally enlarge human intelligence and human knowledge by suitable feedback. Seen in this light, there is no reason to fear computers in medicine.

References

1. Ellis JH, Suncana PP, Levin DC (1975) A computer program for calculation and interpretation of pulmonary function studies. Chest 68:209-213
2. Heise D, Schneider S, Woerner H, Brzostek D (1989a) Klinische Anwendung eines Expertensystems zur Interpretation von Lungenfunktionsprüfungen - Abhängigkeit der Ergebnisse vom jeweiligen Anwender. Pneumonologie 43, in press
3. Heise D, Schlimmer P, Schmidt H, Sybrecht GW (1989b) Silikose-Gutachten: Vergleich zwischen Schwere der Funktionsstörungen und zuerkannter Minderung der Erwerbsfähigkeit. Atemwegs- und Lungenkrankheiten 15:254-255
