Serial spatial reversal learning in rats: comparison of instrumental and automaintenance procedures.

Physiology & Behavior, Vol. 50, pp. 1145-1151. Pergamon Press plc, 1991. Printed in the U.S.A.

0031-9384/91 $3.00 + .00

Serial Spatial Reversal Learning in Rats: Comparison of Instrumental and Automaintenance Procedures

PHILIP J. BUSHNELL³ AND MARK E. STANTON

Neurotoxicology Division, U.S. Environmental Protection Agency, Research Triangle Park, NC 27711

Received 24 September 1990

BUSHNELL, P. J. AND M. E. STANTON. Serial spatial reversal learning in rats: Comparison of instrumental and automaintenance procedures. PHYSIOL BEHAV 50(6) 1145-1151, 1991.--Serial reversals of a spatial discrimination were trained in rats under automaintenance conditions, in which food reward occurred regardless of responding. This automaintained reversal learning was compared to instrumental reversal learning in other rats trained under a similar procedure which required responding for reward. In the automaintenance (AU) procedure, rats received food after every retraction of a "positive" response lever (S+); retraction of a second, "neutral" lever (S°) was not paired with food delivery. Responses to the S+ were elicited at fairly constant rates during daily 100-trial conditioning sessions. Responses to the S° occurred early in each session but rapidly diminished across trials. When the valences of the levers were reversed, responding shifted to the new S+ and diminished on the new S°. Criterion for reversal was defined as a discrimination ratio (DR) of at least 90% responding to the S+ in two consecutive 10-trial blocks. With repeated reversals, acquisition of criterion performance occurred with increasing rapidity, reaching an asymptote below that required for the original discrimination. A second group of rats was trained on a similar instrumental schedule, in which at least one response to the S+ was required for food delivery. Response rates in this instrumental (IN) group were approximately double those of the AU group. However, ratios of S+ to S° response rates were similar to those of the AU group, and the serial reversal curves generated were qualitatively similar.
Thus rats can show improvement across serial reversals of a spatial discrimination based entirely on pairings of stimulus events (automaintenance), in a manner similar to that observed in instrumental procedures, in which reward is contingent upon correct responding.

Automaintenance; Instrumental conditioning; Pavlovian conditioning; Rat; Repeated acquisition; Reversal learning; Serial reversals; Spatial discrimination

TRADITIONALLY, studies of reversal learning have employed instrumental learning procedures; studies of serial reversal learning that have used other conditioning procedures are rare or nonexistent. Reversal of classically conditioned nictitating membrane responses in rabbits has been used to evaluate the role of hippocampal lesions (14), but only with a single reversal. Since Pavlovian learning procedures have facilitated solution of theoretical problems in the analysis of instrumental (operant) discrimination learning procedures [e.g., (12,19)], analysis of serial reversal learning may also benefit from this approach. It was therefore of interest to determine whether the changes in rate of reversal learning typical of instrumental tasks could be obtained using a classical conditioning approach. Automaintenance, in which responding to a signal for reward is elicited by pairing the signal consistently with the occurrence of that reward, was used for this purpose. In typical studies of this type, rats press a single lever, the retraction of which reliably precedes food delivery; these presses are maintained by the pairing of the lever and the food (9,18). In contrast to instrumental tasks, no response is required for delivery of food; thus this conditioning procedure lacks any response-reward contingency, and relies instead upon stimulus-reward pairings to elicit responses from the animal.

We have shown previously that automaintained responding can be used as an index of reversal learning (3,4). In this work, two levers were repeatedly inserted into a test chamber for a brief time interval; the retraction of one lever reliably preceded pellet delivery, while that of the second did not. Under these conditions, rats emitted many responses to the first lever (S+) and few if any to the second lever. Because its retraction was temporally uncorrelated with pellet delivery, this second lever was called an S° (rather than an S-). When the contingencies between the levers were reversed, the rat's behavior shifted toward the new S+ and away from the new S°, resulting in an

¹The research described in this article has been reviewed by the Health Effects Research Laboratory, U.S. Environmental Protection Agency, and approved for publication. Approval does not signify that the contents necessarily reflect the views and policies of the Agency nor does mention of trade names or commercial products constitute endorsement or recommendation for use.
²A portion of these data was presented at the annual meeting of the Society for Neuroscience, Phoenix, AZ, November, 1989.
³Requests for reprints should be addressed to Philip J. Bushnell, Ph.D., Neurotoxicology Division, MD-74B, U.S. Environmental Protection Agency, Research Triangle Park, NC 27711.


automaintained reversal. After a number of reversals, acquisition rates reached a steady state, in which a reversal was completed within 30-50 trials of a 100-trial session. Since reward was not contingent upon the occurrence of a response, differential reinforcement for choice accuracy did not occur. Previous reports with this automaintenance procedure utilized asymptotic reversal performance as a baseline against which the effects of p-xylene (3) and trimethyltin (4) were evaluated.

The purposes of this paper were 1) to demonstrate that repeated reversals of a spatial discrimination under automaintenance conditions yield progressive improvement in learning individual reversals, and 2) to compare reversal learning in this task with that generated by an analogous instrumental procedure.

METHOD

Subjects

Thirteen male Long-Evans rats (Charles River, Raleigh, NC), 5 to 6 months of age at the beginning of behavioral training, were housed individually in suspended plastic cages on heat-treated, shaved-pine bedding, under a 12:12-h L:D photoperiod with light onset at 0600 h. All testing occurred during the light phase of the cycle. Each animal was maintained at 350 g body weight by scheduled food (Ralston Purina, St. Louis, MO) delivery 1 to 6 h after daily behavioral testing; water was available ad lib in the home cage. The two groups of rats consisted of the control groups from two ongoing studies of the effects of repeated oral styrene exposure on behavior. All rats in this study received daily oral doses of corn oil vehicle (1.0 ml/kg b.wt.), 5 days/week for 8 weeks, ending 2 months prior to the beginning of reversal training. Testing of the instrumental group (IN: n = 7) preceded that of the automaintained group (AU: n = 6) by 3 months.

Apparatus

Eight standard rat operant conditioning chambers (Coulbourn Instruments, Lancaster, PA) were each equipped with two retractable response levers 3.4 cm in width, mounted with inside edges 13 cm apart on one wall of the chamber. The levers were extended and retracted pneumatically with travel time of approximately 200 ms. The levers were modified to register depressions of less than 0.30 N; within a given chamber, the force varied by no more than 0.05 N. A cue light was mounted immediately above each lever. A food cup with a swinging plastic door (Campden Instruments, via Stoelting, Chicago, IL) was centered between the levers. A microswitch attached to the door registered nosepokes into the food cup. Each chamber was located in a sound-attenuating shell within which white noise (80 dB with flat attenuation, measured at the opening of the food cup) was provided. Reinforcers were 45-mg pellets (Bio-Serv, Frenchtown, NJ). Control of stimuli and recording of responses were accomplished by computer (PDP8/a, Digital Equipment, Maynard, MA) and SKED interface with SUPERSKED software (State Systems, Kalamazoo, MI).

Initial Training

Rats were trained to press one lever using a combined autoshaping-operant protocol described elsewhere (3). In brief, the lever was extended into the operant chamber and lit by its cue light. In the absence of a leverpress, the lever was retracted after 15 s and a pellet was delivered to the food cup. A leverpress during the 15-s period caused immediate retraction of the lever and pellet delivery. Trials were separated by a variable-time 45-s intertrial interval (ITI).

Reversal Learning

Instrumental group: Single-lever training. Each rat reliably pressed the response lever for food on the autoshaping-operant schedule within 18 50-trial sessions. Beginning with the 19th session, the lever remained inserted for the full 15-s trial period, regardless of the occurrence of a response. If at least one press (R+) occurred, a pellet was delivered upon retraction of the lever at the end of the 15-s period. If no press occurred, no pellet was delivered. Twenty 100-trial sessions on this schedule were administered, followed by a series of 10 sessions involving extinction (data not reported).

Instrumental group: Two-lever training. Beginning with the 53rd test session, the second lever (S°) was extended into the test chamber. The contingencies between the S+, R+, and pellet delivery continued unchanged. The S° was presented on a schedule with intervals identical to those of the S+, but S° presentations were temporally uncorrelated with the S+ schedule and with pellet delivery. Since pellet delivery was randomly timed with respect to the S°, it could by chance follow retraction of the S° on some trials. Each press to the S° (R°) was recorded but had no programmed consequence. The average value of the interstimulus intervals for both S+ and S° was reduced to 30 s (range, 1.6 to 99.1 s) at this time as well. Five 100-trial sessions on this schedule served to define the original discrimination (OD) between the S+ and the S°.

Automaintenance group: Single-lever training. When all rats reliably emitted leverpresses (16 50-trial sessions), the schedule was modified such that a leverpress no longer caused lever retraction, and a pellet was delivered at the end of the 15-s period regardless of the rat's behavior. As for the IN group, daily sessions were 100 trials on this automaintenance schedule, which remained in effect for 16 sessions. A series of tests of the effects of extinction (8 sessions) and changes in probability of reinforcement (12 sessions) followed (data not reported).

Automaintenance group: Two-lever training. Beginning with the 58th test session, the S° was presented for the first time to the AU group. It was extended and retracted in an identical manner to that described above for the S° in group IN. The average interstimulus interval was reduced to 30 s for this group at this time as well. Also like the IN group, each animal received five 100-trial sessions under these conditions, which served to define the OD.

Reversals

A reversal was defined as a change in the designation of the S+ and S°. At the beginning of a reversal session, the lever which was previously the S+ was designated the S°, and vice versa. No cue for this change was provided to the animal. The first reversal was programmed at the beginning of the 6th two-lever session for both groups. Lever designations were changed after all animals reached criterion (see below) on a given reversal, for a total of 16 reversals. Reversal 1 consisted of 500 trials, Reversal 2 of 600 trials, Reversal 3 of 400 trials, and succeeding reversals of progressively fewer trials, to a minimum of 100 trials (1 session). The OD and Reversals 1 to 9 were used to demonstrate the improvement in the rate of reversal learning as asymptotic learning was approached, and Reversals 11 to 16 were used to compare the groups' reversal performance after asymptotic reversal performance had been achieved.
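The difference between the two procedures comes down to the rule for pellet delivery at S+ retraction, plus the swap of lever designations at each reversal. The following is a minimal sketch of those two rules only, not the SKED program used in the study. The lever labels "A" and "B", the function names, and the choice of "A" as the original S+ are hypothetical.

```python
def pellet_delivered(procedure: str, pressed_s_plus: bool) -> bool:
    """Decide whether a pellet follows retraction of the S+ on one trial.

    AU: the pellet is delivered regardless of the rat's behavior.
    IN: the pellet requires at least one press (R+) during the 15-s period.
    Presses to the S° (R°) have no programmed consequence in either case.
    """
    if procedure == "AU":
        return True
    if procedure == "IN":
        return pressed_s_plus
    raise ValueError(f"unknown procedure: {procedure!r}")


def s_plus_lever(reversal_number: int, original_s_plus: str = "A") -> str:
    """Return which lever is the S+ after a given number of reversals.

    reversal_number 0 is the original discrimination (OD); each reversal
    swaps the S+ and S° designations. Labels "A"/"B" are hypothetical.
    """
    other = "B" if original_s_plus == "A" else "A"
    return original_s_plus if reversal_number % 2 == 0 else other


# Pellet outcome on a trial with no press to the S+, under each procedure.
for proc in ("AU", "IN"):
    print(proc, pellet_delivered(proc, pressed_s_plus=False))

# Lever designated S+ during the OD and the first three reversals.
print([s_plus_lever(n) for n in range(4)])
```

On a trial with no press to the S+, the AU rule still delivers a pellet and the IN rule does not. That is the only contingency that differs between the groups in this sketch.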

Dependent Measures

Response frequencies to each lever were summed across 10-trial blocks for each rat. These response frequencies, f(R+) and f(R°), were examined across trial blocks and sessions to quantify changes in response tendency toward the S+ and S°. A Discrimination Ratio (DR) was calculated from these frequencies as the proportion of total responses in a given trial block directed toward the S+, i.e., DR = f(R+)/[f(R+) + f(R°)]. The learning criterion for each reversal was a DR at or exceeding 0.9 for two consecutive 10-trial blocks. To characterize improvement in reversal learning, the number of trials to reach this criterion was counted for each rat. Trials to criterion were then averaged across rats within groups and plotted as a function of reversal.

To characterize asymptotic reversal learning, daily reversals of the two groups were compared after asymptotic reversal performance had been achieved (Reversals 11 to 16). To quantify reversal acquisition at this stage of learning, parameters were estimated from these sessions by fitting nonlinear functions to DR values across 10-trial blocks for each rat within each reversal session. The equation of Hull (11) was modified to calculate an asymptote A, a deviation from asymptote D, and a (negative exponential) learning rate R. The equation has the form DR = A - D(10^(-Rx)), where DR is the Discrimination Ratio (y value) calculated for each 10-trial block (x). X values were designated at the midpoint of each block (i.e., given values of 5, 15, 25 . . . 95 for the 100-trial session). With perfect discrimination, A is unity; the greater the deviation from A at the start of a reversal, the larger the absolute value of D; the faster the return to asymptote, the larger the absolute value of R (typically ranging from 0.01 to 0.20).
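The dependent measures above can be illustrated with a short sketch. It computes the DR per 10-trial block, counts trials to criterion, and evaluates the acquisition function. Several points are assumptions rather than details from the text: the function takes the form DR = A - D * 10**(-R * x), a block with no responses has an undefined DR, trials are counted through the end of the second criterion block, and the response counts are invented for illustration. The fitting procedure is not described in the text and is not attempted here.

```python
def discrimination_ratio(r_plus, r_zero):
    """DR = f(R+) / [f(R+) + f(R°)] for one 10-trial block."""
    total = r_plus + r_zero
    # Assumption: with no responses at all, DR is undefined (None).
    return None if total == 0 else r_plus / total


def trials_to_criterion(blocks, block_size=10, criterion=0.9, run=2):
    """Trials elapsed once DR >= criterion has held for `run` blocks in a row.

    Assumption: trials are counted through the end of the last criterion
    block. Returns None if the criterion is never met.
    """
    streak = 0
    for index, (r_plus, r_zero) in enumerate(blocks, start=1):
        dr = discrimination_ratio(r_plus, r_zero)
        streak = streak + 1 if dr is not None and dr >= criterion else 0
        if streak == run:
            return index * block_size
    return None


def acquisition_curve(x, a, d, r):
    """Predicted DR at trial x, assuming DR = A - D * 10**(-R * x)."""
    return a - d * 10 ** (-r * x)


# Block midpoints used as x values for a 100-trial session: 5, 15, ..., 95.
midpoints = [5 + 10 * i for i in range(10)]

# Hypothetical (R+, R°) response counts for six consecutive 10-trial blocks.
blocks = [(2, 18), (10, 10), (18, 4), (19, 1), (20, 2), (22, 1)]

print(trials_to_criterion(blocks))
print([round(acquisition_curve(x, a=1.0, d=0.8, r=0.05), 3) for x in midpoints])
```

Under this form, the predicted DR at x = 0 is A - D, and a larger R brings the curve back to A in fewer trials, which matches the verbal description of the three parameters.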

Statistical Analyses

Analysis of variance [ANOVA: SAS General Linear Model (15)] with groups as a between-subject factor and reversals as a within-subject factor was used to assess improvement in reversal learning (trials to criterion). Frequencies of R+ and R°, summed within trial blocks for the original discrimination, Reversal 1, and Reversals 11 to 16 combined, were subject to similar analyses with blocks and days as repeated measures. Group differences in the parameters of the nonlinear acquisition function, A, D, and R, were evaluated with t-tests. Greenhouse-Geisser df corrections were applied as necessary to repeated-measures factors in all ANOVAs; nominal df and correction factors (ε) are reported for the F ratios obtained. The overall α level for each ANOVA or t-test was 0.05.

RESULTS

Reversal Training: Procedural Limitations

During acquisition of Reversal 2, 2 of the 7 IN group rats ceased responding to the S° and did not begin responding to the S+. This extinction was remedied by continuing reversal training with 4 sessions using the AU procedure. Delivery of pellets in the absence of an R+ reinstated responding to the S+ in these rats. The animals were then returned to the IN procedure for the remainder of the study; however, their data were excluded from all analyses, leaving 5 rats in the IN group.

Also during acquisition of Reversal 2, 2 of the 6 AU rats' leverpress rates fell close to zero. Observation of the animals showed that they attended to the S+ and engaged in vigorous exploratory activity in its immediate vicinity, but did not emit detectable leverpress responses. In an attempt to shape their behavior to a topography more amenable to detection by the lever, these rats were placed on a single-lever variable-ratio (VR) schedule (1 day at VR5 followed by 1 day at VR10). Both rats responded appropriately on the VR schedule and continued to respond when returned to the AU schedule. However, their rates never approached those of the other rats in the group, and their behavior toward the manipulandum tended to drift back to its preferred mode over time. Their data, too, were excluded from all analyses, leaving 4 rats in the AU group.

Reversal Learning: Trials to Criterion

The original discrimination (OD) was acquired in an equivalent number of trials by both groups (Fig. 1), and learning improved across reversals in both groups. However, different patterns of improvement were obtained: the IN group required far fewer trials to reach criterion on Reversal 1 than did the AU group, but the AU group reversed more quickly later (see below). Overall analysis of trials to criterion on Reversals 1 to 9 revealed a significant effect of Reversal, F(9,63)=14.47, ε=0.345, p
