Can Artificial Neural Networks Provide an "Expert's" View of Medical Students' Performances on Computer Based Simulations?

Ronald H. Stevens, Ph.D., Katayoun Najafi

Dept. of Microbiology & Immunology

UCLA School of Medicine

ABSTRACT

Artificial neural networks were trained to recognize the test selection patterns of students' successful solutions to seven immunology computer based simulations. When new students' test selections were presented to the trained neural network, their problem solutions were correctly classified as successful or non-successful >90% of the time. Examination of the neural network's output weights after each test selection revealed a progressive increase for the relevant problem, suggesting that a successful solution was represented by the neural network as the accumulation of relevant tests. Unsuccessful problem solutions revealed two patterns of student performance. The first pattern was characterized by low neural network output weights for all seven problems, reflecting extensive searching and a lack of recognition of relevant information. In the second pattern, the output weights from the neural network were biased towards one of the remaining six incorrect problems, suggesting that the student mis-represented the current problem as an instance of a previous problem.

INTRODUCTION

The rapidly emerging information technologies and the increased emphasis on problem based learning are converging forces which are predicted to have powerful effects on medical education and learning [1]. Deriving maximum educational benefit from these instructional developments will require parallel evolution of new strategies to evaluate student performances within these formats.

Our previous studies have shown that successful and non-successful problem solutions can be readily distinguished by search path mapping, where the progression of individual students, or groups of students, during the problem solving process is logged and graphically displayed [2,3]. Successful solutions are characterized by patterns of focused test selections centered in the relevant content area, whereas unsuccessful solutions display patterns more consistent with searching and failure to recognize or interpret relevant data.

While providing insight into how students approach and search for problem solutions, the use of this information for improving performance has been difficult. For instance, the search path map patterns are generated after a student has completed a problem, reducing the ability to detect misconceptions during the problem solving process. Additionally, for maximum usefulness, the search path map patterns must be viewed and interpreted by an expert. The presence of distinctive patterns of successful problem solutions suggested, however, that a variant of search path mapping could be implemented utilizing artificial neural networks.


In this study we have trained artificial neural networks to recognize the patterns of successful student solutions to problems and have shown that, when presented with new students' problem solving performances, these networks successfully classified the problem solving outcomes.

METHODS

Problem Solving And Search Path Mapping Analysis Software

The software "IMMEX: Problem Solving Exercises in Immunology" has been used for evaluating students for 5 years (150 students/year) and has been previously described [2,3]. As students progress through the problems, a log is kept of the test selections, time between tests, diagnoses, etc., which allows the reconstruction of the problem solving process. SQL queries can be made to this database regarding the selection of tests by individual students or groups of students based on scores, problems, times spent on a problem, correct or incorrect solutions, etc. In reconstructing the problem solving process, the IMMEX::ANALYSIS software presents graphical representations where the potential test selections available to students appear as small rectangles, with lines connecting the sequence of tests selected. An individual search path map is a visualization of the sequential test selections made by a student in performing a simulation. In a group search path map, multiple individual search path maps are combined; the individuality of a student is then lost, but the most frequent combinations of test selections by the group satisfying a specific query to the database are obtained. It then becomes possible to compare one student's performance with that of groups selected by a variety of criteria.
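A query of this kind might look like the sketch below, which tallies how often each test was selected by students who correctly solved a given case, the form of data later used to train the networks. The schema (a test_log table with student_id, case_name, test_name, selection_order and solved columns) is hypothetical; the actual IMMEX database layout is not specified here.

    import sqlite3
    from collections import Counter

    def successful_test_frequencies(db_path, case_name):
        """Tally test selection frequencies for students who solved a case
        (hypothetical schema, not the real IMMEX database)."""
        conn = sqlite3.connect(db_path)
        rows = conn.execute(
            "SELECT test_name FROM test_log "
            "WHERE case_name = ? AND solved = 1 "
            "ORDER BY student_id, selection_order",
            (case_name,),
        ).fetchall()
        conn.close()
        return Counter(name for (name,) in rows)

    # e.g. successful_test_frequencies("immex.db", "IL-2 deficiency")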

Construction Of Artificial Neural Networks

Supervised neural networks build models which classify patterns or make decisions according to patterns of inputs and outputs they have "learned" from training on a variety of problem patterns. The supervised neural networks used in these studies were locally programmed for the Microsoft Windows® environment using neural network software libraries obtained from Ward Systems Group Inc., Frederick, Maryland. Multiple neural networks were constructed which differed in the number of classifying characteristics, hidden neurons, momentum, learning rate and final learning errors. The network used for these studies had a three-layer structure and consisted of 533 input neurons, 20 hidden neurons and 7 case specific output neurons, which were fully interconnected by weighted links; the momentum was 0.9, the learning rate 0.6 and the network was trained to a 0.005 sum of errors. The first step in generating the neural net was to obtain training data for classifying each of the 7 problems. This consisted of the test selections of students who successfully solved a problem. Similar values were obtained for each of the 7 separate immunology case simulations. This process resulted in a total of 533 classifying characteristics of test selections and the frequency with which they were chosen. These values constituted the training data. The ability of the network to classify individual students' performances on a problem was then tested using student data obtained under examination conditions 2 years previously. Output weights, which can be loosely interpreted as the probability of a certain problem being solved, can range from 0 to 1, and those greater than 0.5 were arbitrarily considered positive.
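The networks themselves were built with the Ward Systems Group libraries. As a rough illustration of the architecture and training settings described above (533 inputs, 20 hidden neurons, 7 outputs, logistic units, momentum 0.9, learning rate 0.6, trained until the summed squared error falls below 0.005), a minimal NumPy sketch might look as follows; the class name, weight initialization and stopping rule are assumptions, not the vendor library's behavior.

    import numpy as np

    class ThreeLayerNet:
        """Minimal 533-20-7 backpropagation network with momentum (illustrative)."""

        def __init__(self, n_in=533, n_hidden=20, n_out=7, lr=0.6, momentum=0.9, seed=0):
            rng = np.random.default_rng(seed)
            self.W1 = rng.uniform(-0.5, 0.5, (n_in, n_hidden))
            self.W2 = rng.uniform(-0.5, 0.5, (n_hidden, n_out))
            self.lr, self.momentum = lr, momentum
            self.dW1 = np.zeros_like(self.W1)
            self.dW2 = np.zeros_like(self.W2)

        @staticmethod
        def _sigmoid(z):
            return 1.0 / (1.0 + np.exp(-z))

        def forward(self, x):
            self.h = self._sigmoid(x @ self.W1)        # hidden activations
            self.o = self._sigmoid(self.h @ self.W2)   # 7 case-specific output weights
            return self.o

        def train(self, X, Y, target_error=0.005, max_epochs=10000):
            """Online backpropagation until the summed squared error over the
            training set falls below target_error."""
            for _ in range(max_epochs):
                error = 0.0
                for x, y in zip(X, Y):
                    o = self.forward(x)
                    err_o = (y - o) * o * (1 - o)                        # output delta
                    err_h = (err_o @ self.W2.T) * self.h * (1 - self.h)  # hidden delta
                    self.dW2 = self.lr * np.outer(self.h, err_o) + self.momentum * self.dW2
                    self.dW1 = self.lr * np.outer(x, err_h) + self.momentum * self.dW1
                    self.W2 += self.dW2
                    self.W1 += self.dW1
                    error += np.sum((y - o) ** 2)
                if error < target_error:
                    break
            return error

    # X: one row of 533 test-selection characteristics per successful solution;
    # Y: the matching one-hot row indicating which of the 7 cases was solved.
    # net = ThreeLayerNet(); net.train(X, Y)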

RESULTS

Artificial Neural Networks Can Be Trained to Recognize Patterns of Successful Student Problem Solutions

Previous search path mapping of students' performances revealed that there is no particular pathway to the solution of a problem. In fact, in several thousand simulation runs, few students have followed the same sequence of test selections [2]. The correct case solution is not likely to be obtained, however, unless key features of the problem are recognized and certain relevant tests selected. This process leads to the formation of group search path map "patterns" for successful case solutions. The distinctiveness of search path map patterns for successful problem solutions suggested that artificial neural networks could be trained to recognize these characteristic features. When subsequently presented with an individual student's performance on a particular problem, these trained networks would compare the new solution pattern with that modeled in the neural network, determine the closeness of pattern matching, and output whether or not a student was classified as having solved or missed the case. To explore this hypothesis, test selections were derived from the search path maps of 47 students who correctly or incorrectly solved an immunology case and were entered as testing characteristics into neural networks pre-generated as described in Methods. The classifying output weights for the 7 different problems were obtained and representative data from 17 students are presented in Table 1.

The successful student solutions were strongly classified by the neural network software, with the majority of the final output weights (i.e., at the end of the case) for the correct problem exceeding 0.8 and the majority of the output weights of the remaining 6 problems below 0.2. The results in the table display the final output values after all the student's test selections were entered into the neural network.

Table 1. Neural network classification of correct and non-correct problem solutions. The test selections made by a) 10 students who solved a problem or b) 7 students who missed a problem were input into the trained neural network and the output weights for the seven cases (kappa chain enhancer, recombinase defect, IL-2 deficiency, CD3 defect, MHC promoter, CD8 defect, beta-2 microglobulin) were determined. 1) The correct solution to a problem is boxed by a rectangle. 2) Spaces are left blank if the output weights were less than 0.1. [Table values not reproduced.]

It is also possible to perform the neural network analysis following each test selection rather than only at the end of the problem. Such an analysis for two successful student solutions indicates that there is often a progressive increase in the output weights for the relevant problem following each test selection, suggesting that the successful case solution is represented by the neural network as the accumulation of relevant test selections (Figure 1).

Figure 1. Case Specific Output Weights Following Sequential Test Selections of Successful Problem Solutions. The first test selection was given an input weight of 1 and entered as test data into the trained neural network. The resulting output weights for each of the cases were then obtained and the process was repeated for each subsequent test selection until the completion of the problem. For each case category the top bar is the output weight following the first test selection and the lowest bar is the output weight following the last test selection. *** indicates the case being solved. [Panels a and b plot the seven cases against output weights from 0 to 1.]
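As a sketch of how such a test-by-test analysis could be computed, the function below feeds a student's selections into a trained network one at a time and records the seven case-specific output weights after each selection. It reuses the illustrative ThreeLayerNet class from the Methods sketch; the mapping of each test selection to a position in the 533-element characteristic vector is an assumption.

    import numpy as np

    def stepwise_output_weights(net, test_indices, n_features=533):
        """Return the 7 case-specific output weights after each successive
        test selection (rows = selections, columns = cases)."""
        x = np.zeros(n_features)
        history = []
        for idx in test_indices:
            x[idx] += 1.0                     # each selected test enters with a weight of 1
            history.append(net.forward(x))    # outputs after this selection
        return np.vstack(history)

    # A rising column for the relevant case reproduces the pattern described for
    # successful solutions; uniformly low columns correspond to searching.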

There are several useful parameters for characterizing how well the neural network is performing, most notably sensitivity and specificity. Sensitivity [TP/(TP+FN)] is the likelihood that an event will be detected given that it is present. Specificity [TN/(TN+FP)] is the likelihood that the absence of an event will be detected given that it is absent. These parameters were applied to the neural network's performance on either an individual problem basis or across problems. The per-problem basis was used for classifying whether a particular problem was or was not correctly solved by a student. In comparing the neural network's classification of a student's performance with that student's actual performance, the neural network correctly classified 22/26 students' correct problem solving performances for a sensitivity of 86% and correctly identified 21/21 non-successful students' performances for a specificity of 100%. There were four instances where the neural network failed to detect a positive outcome. If the output weight required for significance were reduced to 0.4, two of these false negatives would become positive. The high sensitivity and specificity of the neural network's performance on individual problems suggests that it is operating as intended in classifying a student's problem solving performance.
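For reference, these two measures reduce to simple ratios of the outcome counts; the helper below is a generic sketch rather than output from the IMMEX software.

    def sensitivity(tp, fn):
        """TP / (TP + FN): probability that a correctly solved case is detected."""
        return tp / (tp + fn)

    def specificity(tn, fp):
        """TN / (TN + FP): probability that a missed case is detected as missed."""
        return tn / (tn + fp)

    # Per-problem counts reported above: sensitivity(22, 4), specificity(21, 0).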

The second way to examine the network's performance was across problems. Often, while the output weights from the neural network correctly identified that the student had missed a specific case, the output weights were above 0.5 for one of the 6 additional non-correct cases. For instance, in Table 1, while student #16's incorrect solution to the CD3 defect case had an output weight of less than 0.1, the output weight for the MHC promoter problem was 0.57. This results in a surprisingly high false positive rate. This high false positive rate across problems suggested that there may be multiple ways for a student to "miss" a problem.
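The across-problem false positives point to two distinct ways a case can be missed: either no output weight rises above the cutoff, or the weight for a different case does. The small helper below flags which occurred for a given non-successful solution; it is a sketch assuming the seven final output weights and the index of the case actually attempted are available, with the same arbitrary 0.5 threshold used above.

    import numpy as np

    def classify_failure(final_weights, target_case, threshold=0.5):
        """Distinguish unfocused searching (no case above threshold) from a
        confident solution of the wrong case (another case above threshold)."""
        weights = np.asarray(final_weights, dtype=float)
        if weights[target_case] >= threshold:
            return "classified as solved"
        other = np.delete(weights, target_case)
        return "wrong problem" if other.max() >= threshold else "unfocused search"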

Why Do Students Miss Problems?

The problem solving process can be divided into the generation of hypotheses, the gathering of information, the integration of information and the evaluation of hypotheses in light of the new information. The inability to formulate a hypothesis would be reflected in the failure to choose relevant data in a directed manner. This would be revealed by extensive searching for information and failure to recognize relevant data when chosen. We have previously observed this form of problem solving difficulty using search path mapping [2,3]. When such students' performances are processed by the neural network on a test by test basis, as expected, the output weights show little selectivity for either the case being solved or any of the additional cases (Figure 2a).

A similar test-by-test neural network analysis was next used to examine those non-successful solutions which were contributing to the false positive results across problems. A representative case (Figure 2b) indicates that the incorrect solution was progressively accumulated throughout the case. These results suggest that a directed, yet inappropriate, approach to the case solution may often contribute to a student missing a problem.

Figure 2. Case Specific Output Weights Following Sequential Test Selections of Non-Successful Problem Solutions. The data are presented as in Fig. 1. [Panels a and b plot the seven cases against output weights from 0 to 1.]

DISCUSSION

Artificial neural networks have been successfully used for the solution of complex problems which are difficult, if not impossible, to solve by other currently known methods [4]. We have explored the potential of using artificial neural networks, trained on successful problem solutions, to begin to function as an expert in the analysis of students' problem solving approaches. Traditional rule-based expert systems rely on heuristic associations between findings and outcomes [5-7]. Unlike an expert system, a neural network does not require the definition of specific rules to use in solving a problem; the network discovers its own rules as it learns from a large number of example patterns with which it is trained.

Our prior years' experience of examining students' problem solutions by search path mapping revealed that patterns of certain test selections were often associated with efficient problem solving. We have extended these studies and have shown that artificial neural networks trained on successful patterns can accurately distinguish individual students' successful and non-successful problem solutions.

One major consideration in the construction of the list of classifying characteristics was how many sequential test selections to include to construct the neural network. The students' progression through problems is highly individualistic, with few instances of identical search paths. This presumably relates to a particular student's knowledge, interpretation of the problem and the data already obtained. With 45 test selections available, however, the potential number of different sequential test selections could be high, even for problem solutions where only 5 tests are selected prior to the problem solution (i.e., 45^5, or approximately 1.8 x 10^8, possible test selection sequences). The probability of students selecting test combinations not included in the classifying characteristics can therefore be high and would result in incomplete test data being presented to the neural network. This indeed occurred and, as could be expected, there were more instances of missing data with poor problem solving performances (35% of test selections not being included in the classifying characteristics) than with successful performances (15%). This missing data may in fact represent the gathering of unhelpful data which, as pointed out by Gruppen et al [8], makes it difficult to produce an accurate decision even if optimum integration of the findings occurs. One of the major advantages of neural networks, however, is that they are highly fault tolerant, and can derive information from incomplete, noisy or partially incorrect cues. Preliminary studies suggest that the performance of our neural network is not significantly degraded by the absence of up to one-third of the test selections.
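This kind of fault tolerance can be probed with a small simulation: zero a random subset of a student's input characteristics and compare the network's outputs with and without the deletions. The sketch below works under that modeling assumption and reuses the illustrative ThreeLayerNet class from the Methods section; it is not the preliminary analysis referred to above.

    import numpy as np

    def degradation_with_missing_tests(net, x, drop_fraction=1/3, trials=100, seed=0):
        """Mean absolute change in the 7 output weights when a random
        drop_fraction of the selected tests is removed from the input."""
        rng = np.random.default_rng(seed)
        baseline = net.forward(x)
        selected = np.flatnonzero(x)              # positions of tests the student chose
        n_drop = int(round(drop_fraction * selected.size))
        changes = []
        for _ in range(trials):
            x_missing = x.copy()
            x_missing[rng.choice(selected, size=n_drop, replace=False)] = 0.0
            changes.append(np.abs(net.forward(x_missing) - baseline))
        return np.vstack(changes).mean(axis=0)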

One interesting feature of the neural network analysis of unsuccessful problem solutions was the demonstration of two distinct categories of student performances. In the first category the neural network did not identify the student's test selections as belonging to any of the 7 potential problems. This type of problem solving performance is what would be expected from individuals with little domain specific knowledge; if relevant information cannot be distinguished from non-relevant information, a focused strategy for correctly solving the problem is difficult. The second category, the correct solution to the wrong problem, was one which had previously been difficult to quantitate based on search path analysis alone, due to the need for an expert's interpretation of this data and the expert's perception of how students view the correspondence among problems. The frequency with which students may view a new problem as an instance of an earlier problem suggests that approaching a problem by analogy or through bias often occurs. If so, then providing continuous neural network outputs as feedback may reduce the use of incorrect analogy and highlight the distinctions among problems while integrating and strengthening the links among concepts.

Acknowledgments

This project was supported in part by USPHS Grant #1D31AH59004. The authors would like to thank Robert J. Syarto for expert programming, Tony Kwak and the Instructional Microcomputer Facility staff in the Louise Darling Biomedical Library for implementation of the IMMEX software, and Marshall Sherman for skilled editorial assistance.

References

1. Piemme TE. Computer-assisted learning and evaluation in medicine. JAMA 1988;260:367-72.
2. Stevens RH, McCoy JM, Kwak AR. Solving the problem of how medical students solve problems. M.D. Computing 1991:13-20.
3. Stevens RH. Search path mapping: a versatile approach for visualizing problem-solving behavior. Academic Medicine 1991:S73-75.
4. Rumelhart DE, McClelland JL. Parallel distributed processing: explorations in the microstructure of cognition. Vol. 2. Psychological and biological models. Cambridge, Mass.: MIT Press, 1986.
5. Feigenbaum EA. Knowledge engineering for medical decision making: a review of computer-based clinical decision aids. Proc IEEE 1979;67:1207-24.
6. Miller RA, Pople HE, Myers JD. Internist-1: an experimental computer-based diagnostic consultant for general internal medicine. N Engl J Med 1982;307:468-76.
7. Wolf FM, Gruppen LD, Billi JE. Differential diagnosis and the competing-hypotheses heuristic: a practical approach to judgement under uncertainty and Bayesian probability. JAMA 1985;253:2858-63.
8. Gruppen LD, Wolf FM, Billi JE. Information gathering and integration as sources of error in diagnostic decision making. Medical Decision Making 1991;11:233-239.
