Journal of Experimental Psychology: Learning, Memory, and Cognition 2015, Vol. 41, No. 5, 1388 –1403

© 2015 American Psychological Association 0278-7393/15/$12.00 http://dx.doi.org/10.1037/a0038853

This document is copyrighted by the American Psychological Association or one of its allied publishers. This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.

Procedural Learning During Declarative Control

Matthew J. Crossley
University of California, Berkeley

F. Gregory Ashby
University of California, Santa Barbara

There is now abundant evidence that human learning and memory are governed by multiple systems. As a result, research is now turning to the next question of how these putative systems interact. For instance, how is overall control of behavior coordinated, and does learning occur independently within systems regardless of what system is in control? Behavioral, neuroimaging, and neuroscience data are somewhat mixed with respect to these questions. Human neuroimaging and animal lesion studies suggest independent learning and are mostly agnostic with respect to control. Human behavioral studies suggest active inhibition of behavioral output but have little to say regarding learning. The results of two perceptual category-learning experiments are described that strongly suggest that procedural learning does occur while the explicit system is in control of behavior and that this learning might be just as good as if the procedural system was controlling the response. These results are consistent with the idea that declarative memory systems inhibit the ability of the procedural system to access motor output systems but do not prevent procedural learning.

Keywords: procedural memory, declarative memory, multiple memory systems, system interaction, category learning

There is now abundant evidence that human learning and memory are mediated by multiple systems (Eichenbaum & Cohen, 2001; Schacter, Wagner, & Buckner, 2000; Squire, 2004). In fact, the existence of multiple memory systems is now so widely accepted that some researchers have begun asking how these putative systems interact (Ashby & Crossley, 2010; Poldrack et al., 2001; Poldrack & Packard, 2003; Schroeder, Wingard, & Packard, 2002). This article investigates interactions between two learning systems: a prefrontal cortex-based declarative system that uses working memory and executive attention to learn via explicit reasoning and a striatal-based procedural system that depends on reinforcement learning. As we show, the available evidence is mixed with respect to the nature of interaction between these two systems.

The behavioral paradigm we chose is perceptual category learning. This is appropriate because evidence suggests that human categorization is mediated by a number of functionally distinct category-learning systems (e.g., Ashby, Alfonso-Reese, Turken, & Waldron, 1998; Erickson & Kruschke, 1998; Love, Medin, & Gureckis, 2004; Reber, Gitelman, Parrish, & Mesulam, 2003) that map directly onto the major memory systems (Ashby & O’Brien, 2005). Much of the categorization research on declarative and procedural learning has used rule-based (RB) and information-integration (II) category-learning tasks. In RB tasks, the categories can be learned via some explicit reasoning process. Frequently, the rule that maximizes accuracy (i.e., the optimal strategy) is easy to describe verbally (Ashby et al., 1998). In the most common applications, only one stimulus dimension is relevant, and the participant’s task is to discover this relevant dimension and then to map the different dimensional values to the relevant categories. A variety of evidence suggests that success in RB tasks depends on declarative memory systems and especially working memory and executive attention (Ashby et al., 1998; Maddox, Ashby, Ing, & Pickering, 2004; Waldron & Ashby, 2001; Zeithamova & Maddox, 2006).

In II category-learning tasks, accuracy is maximized only if information from two or more stimulus components is integrated at some predecisional stage (Ashby & Gott, 1988). In many cases, the optimal strategy is difficult or impossible to describe verbally (Ashby et al., 1998). An example of an II task is shown in Figure 1. In this case, the two categories are each composed of circular sine-wave gratings that vary in the width and orientation of the dark and light bars. The diagonal line denotes the category boundary. Note that no simple verbal rule correctly separates the disks into the two categories. Nevertheless, many studies have shown that people reliably learn such categories, provided they receive consistent and immediate feedback after each response (for a review, see Ashby & Maddox, 2005). Evidence suggests that success in II tasks depends on procedural learning that is mediated largely within the striatum (Ashby & Ennis, 2006; Filoteo, Maddox, Salmon, & Song, 2005; Knowlton, Mangels, & Squire, 1996; Nomura et al., 2007). For example, one feature of traditional procedural-learning tasks is that switching the locations of the response keys interferes with performance (e.g., Willingham, Wells, Farrell, & Stemwedel, 2000). In agreement with this result,

This article was published Online First March 9, 2015. Matthew J. Crossley, Department of Psychology, University of California, Berkeley; F. Gregory Ashby, Department of Psychological & Brain Sciences, University of California, Santa Barbara. This research was supported in part by the U.S. Army Research Office through the Institute for Collaborative Biotechnologies under Grant W911NF-07-1-0072 and by Air Force Office of Scientific Research Grant FA9550-12-1-0355. We thank Jamie White for her help collecting data. Correspondence concerning this article should be addressed to Matthew J. Crossley, Department of Psychology, University of California, Berkeley, Berkeley, CA 94720. E-mail: [email protected]


Figure 1. Top: Example trials in a typical category-learning experiment. Bottom: Examples of information-integration and rule-based categories. See the online article for the color version of this figure.

switching the locations of the response keys in the Figure 1 categorization tasks interferes with II performance but not with RB performance (Ashby, Ell, & Waldron, 2003; Maddox, Bohil, & Ing, 2004; Spiering & Ashby, 2008).

The COVIS (competition between verbal and implicit systems) theory of category learning (Ashby et al., 1998) predicts that people begin II tasks by experimenting with simple explicit rules. The original version of the theory predicted that as learning progresses, people begin switching trial by trial between the explicit and procedural systems. However, Ashby and Crossley (2010) reported strong evidence against trial-by-trial switching. Their study used sine-wave gratings like those shown in Figure 1 and hybrid category structures in which optimal accuracy was possible if a simple rule was used when the bar orientation was steep and a procedural strategy was used when the orientation was shallow. Even though participants could easily learn either subtask, only three of 53 participants in several different experiments showed any evidence of trial-by-trial switching. Instead, almost all participants used either a simple suboptimal RB strategy or a suboptimal procedural strategy on all trials, with the former group of participants greatly outnumbering the latter.

On the basis of these results, Ashby and Crossley (2010) proposed that use of explicit strategies inhibits use of the procedural system. This interpretation seemed consistent with neuroimaging evidence reporting an antagonistic relationship between neural activation in the striatum and medial temporal lobes during category learning; that is, striatal activation tended to increase with category learning, whereas medial temporal lobe activation decreased (Moody, Bookheimer, Vanek, & Knowlton, 2004; Nomura et al., 2007; Poldrack et al., 2001; Poldrack, Prabhakaran, Seger, & Gabrieli, 1999). Similar results have been reported within the more general memory systems literature. For example, several functional MRI studies of skill learning have also reported negative correlations between medial temporal lobe and striatal activation (Dagher, Owen, Boecker, & Brooks, 2001; Jenkins, Brooks, Nixon, Frackowiak, & Passingham, 1994; Poldrack & Gabrieli, 2001). In addition, a number of animal lesion studies have reported that medial temporal lobe lesions can improve performance in striatal-dependent habit-learning tasks and, conversely, that striatal lesions can improve performance in medial temporal lobe-dependent spatial-learning tasks (e.g., Mitchell & Hall, 1988; O’Keefe & Nadel, 1978; Schroeder et al., 2002).

However, in more recent neuroimaging work, Foerde, Knowlton, and Poldrack (2006) reported persistent striatal activation even during declarative control. Moreover, lesion studies show that behavior can be switched from goal directed to habitual and vice versa, suggesting that these two systems learn simultaneously (Balleine & Dickinson, 1998; Coutureau & Killcross, 2003; Killcross & Coutureau, 2003; Yin, Knowlton, & Balleine, 2004).


Human neuroimaging and animal lesion studies suggest independent learning and are mostly agnostic with respect to control. Human behavioral studies suggest active inhibition of behavioral output but have little to say regarding learning. Thus, one natural speculation from this previous work is that the procedural system learns even when the declarative system controls behavior. Here, we explore this previously untested hypothesis. The results of two perceptual category-learning experiments are described that strongly suggest that procedural learning does occur while the explicit system is in control and that this learning might be just as good as if the procedural system was controlling the response. These results are consistent with the idea that declarative memory systems inhibit the ability of the procedural system to access motor output systems but do not prevent procedural learning.

Experiment 1

The design of Experiment 1 is illustrated in Figure 2. The experiment included four conditions, each composed of three phases. In the parse-congruent condition, all three phases used the same II categories, but in Phases 1 and 2, some stimuli were never shown. Specifically, during Phases 1 and 2, stimuli were selected from the two categories in such a way that a one-dimensional rule could achieve perfect accuracy. During Phase 1, a simple rule on bar width was optimal (thick vs. thin), whereas in Phase 2, the optimal rule depended only on bar orientation (steep vs. shallow). The final phase included all stimuli and therefore required a procedural strategy for optimal accuracy. Critically, however, all stimuli shown during Phase 3 had been previously shown at some

Figure 2. Experiment 1 design. Each condition consisted of two training phases in which participants were given information-integration (II) categories parsed in such a way as to allow perfect accuracy with a one-dimensional rule and a transfer phase in which participants were given the complete II categories. See the online article for the color version of this figure.


time during Phase 1 or 2 (or both), and no stimuli changed categories. Thus, if the procedural system was able to learn during Phases 1 and 2 when the explicit system was in control, then transfer performance during Phase 3 should be good. However, if the procedural system was unable to learn when the explicit system controlled responding, then Phase 3 transfer performance should be poor.

The other three conditions of Experiment 1 served as various control conditions. The parse-rotated condition used exactly the same stimuli during Phases 1 and 2 as in the parse-congruent condition, but Phase 3 used categories that were rotated 90° counterclockwise. Thus, many stimuli changed category membership in Phase 3 relative to Phases 1 and 2, so procedural learning during Phases 1 and 2 should impair transfer performance. To see this, note that procedural category learning reflects learned mappings from regions of perceptual space to category responses and that half of the mappings learned during training will conflict with the mappings required for good transfer performance. The control-congruent and control-rotated conditions trained participants on the full II categories from the parse-congruent condition. Thus, Phases 1 and 2 of these conditions were identical to the transfer phase of the parse-congruent condition. The control-congruent condition continued with these II categories during transfer (Phase 3), whereas the control-rotated condition switched to categories that were rotated 90° counterclockwise.
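The claim that half of the learned mappings conflict after rotation can be checked numerically. The following is a minimal Python sketch, not the authors' code. It assumes the 0 to 100 stimulus space and the x > y labeling described in the Method, assumes that x is bar width and y is bar orientation, assumes that the rotation is taken about the center of the space, and assumes one plausible way of parsing the categories (keeping only stimuli on which a one-dimensional rule with its criterion at 50 agrees with the II label). All function names are hypothetical.

```python
import random

CENTER = 50.0  # assumed center of the 0-100 stimulus space


def label_congruent(x, y):
    """II categories used in training and congruent transfer: A if x > y."""
    return "A" if x > y else "B"


def label_rotated(x, y):
    """Categories after rotating the structure 90 degrees counterclockwise.

    A stimulus gets the label that its pre-image (the point rotated 90
    degrees clockwise about the center) had in the original structure.
    """
    pre_x, pre_y = y, 2 * CENTER - x
    return label_congruent(pre_x, pre_y)


def width_rule(x, y):
    """Hypothetical Phase 1 rule on bar width (x): thick -> A."""
    return "A" if x > CENTER else "B"


def orientation_rule(x, y):
    """Hypothetical Phase 2 rule on bar orientation (y)."""
    return "A" if y < CENTER else "B"


rng = random.Random(0)
stimuli = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(800)]

# Fraction of stimuli whose category changes under rotation (about one half).
changed = sum(label_congruent(x, y) != label_rotated(x, y) for x, y in stimuli)
fraction_changed = changed / len(stimuli)

# Parsed training sets: keep stimuli on which the rule matches the II label.
phase1 = [s for s in stimuli if width_rule(*s) == label_congruent(*s)]
phase2 = [s for s in stimuli if orientation_rule(*s) == label_congruent(*s)]

# Every transfer stimulus appears in at least one parsed training set.
covered = set(phase1) | set(phase2)
all_covered = all(s in covered for s in stimuli)

print(f"changed category after rotation: {fraction_changed:.2f}")
print(f"phase 1 / phase 2 set sizes: {len(phase1)} / {len(phase2)}")
print(f"all stimuli seen during training: {all_covered}")
```

Under these assumptions, each parsed set retains roughly three quarters of the sampled stimuli, and the two sets together cover the full II categories, which is the property the design relies on.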

Method

Participants. There were 19 participants in the parse-congruent condition, 18 in the parse-rotated condition, 21 in the control-congruent condition, and 17 in the control-rotated condition. All participants completed the study, had normal or corrected-to-normal vision, and received course credit for their participation. To ensure that only participants who performed well above chance were included in the transfer phase analysis, a learning criterion of 65% correct during each training phase was applied. Using this criterion, we excluded two participants from the parse-congruent condition, two from the parse-rotated condition, three from the control-congruent condition, and four from the control-rotated condition.

Stimuli. Stimuli were circular sine-wave gratings that varied in bar width and bar orientation (like those illustrated in Figure 1) and were generated by drawing 800 random samples (x, y) from a (0–100, 0–100) bivariate uniform distribution along the two stimulus dimensions. Samples that satisfied x > y were labeled Category A exemplars, and samples that satisfied x < y were labeled Category B exemplars in every phase except Phase 3 of the parse-rotated and control-rotated conditions. Each random sample (x, y) was converted to a stimulus according to the nonlinear transformations defined by Treutwein, Rentschler, and Caelli (1989), which roughly equate the salience of each dimension (see Appendix A for details). Optimal accuracy was 100%. All stimuli in each category were randomly sampled without replacement from the original random sample of 800 stimuli. This was done independently for each participant in each block.

Procedures. All participants received eight training blocks of 75 trials each, followed by two transfer blocks of 100 trials each. Participants in the parse-congruent and the parse-rotated conditions received one-dimensional RB stimuli during all training blocks (Blocks 1–8). One one-dimensional RB category was presented during the first four training blocks, and the other was presented during the final four training blocks (see Figure 2). The order in which the RB categories were presented was counterbalanced across participants. During transfer, participants in the parse-congruent condition were given the full underlying II categories, and participants in the parse-rotated condition were given the rotated II categories. Participants in the control-congruent condition received the full II categories for all training and transfer blocks. Participants in the control-rotated condition received identical training to the control-congruent condition but received the full rotated categories during transfer.

The sequence of events on each trial was constant across all trials in every condition in Experiment 1. A fixation cross was displayed at the center of the screen for 0.75 s. The fixation cross was then replaced with a response-terminated category stimulus. Participants generated a response by pressing the d key for Category A or the k key for Category B. Neither of these keys was given a special label, but the oral instructions informed participants of the category-label-to-button mappings. If any button other than one of these two was pressed, an “invalid key” message was sounded. If the participants failed to make a response after 5 s, the stimulus disappeared and a message was displayed: “Please respond faster.” Auditory feedback was presented immediately after the response. Correct responses were indicated by a pure sine tone (500 Hz, 0.73 s in duration), and incorrect feedback was indicated by a sawtooth tone (200 Hz, 1.22 s in duration). A 1-s blank screen intertrial interval immediately followed auditory feedback.
Participants in each condition were told that they were to categorize stimuli on the basis of their spatial frequency and orientation, that there were two equally likely categories, and that perfect accuracy could be achieved. They were also told that the category structure might or might not change in between blocks. Finally, they were asked to write down the strategy they used to classify the stimuli at the end of each training block (but not during the transfer blocks).
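As a cross-check on the procedure, the block structure described above can be tallied and re-expressed in the 25-trial blocks used for the accuracy analysis in Figure 3. The sketch below is only illustrative arithmetic in Python. The names and the list layout are hypothetical, while the block and trial counts are taken from the text.

```python
# (phase name, number of blocks, trials per block), as described in Procedures.
SCHEDULE = [
    ("training 1", 4, 75),
    ("training 2", 4, 75),
    ("transfer", 2, 100),
]

ANALYSIS_BLOCK = 25  # trials per block in the accuracy analysis


def analysis_block_ranges(schedule, block_size):
    """Map each phase to its (first, last) analysis-block index, 1-based."""
    ranges, start = {}, 1
    for name, n_blocks, n_trials in schedule:
        n_analysis = (n_blocks * n_trials) // block_size
        ranges[name] = (start, start + n_analysis - 1)
        start += n_analysis
    return ranges


total_trials = sum(n_blocks * n_trials for _, n_blocks, n_trials in SCHEDULE)
block_ranges = analysis_block_ranges(SCHEDULE, ANALYSIS_BLOCK)

print(total_trials)   # 800
print(block_ranges)   # training 1: (1, 12), training 2: (13, 24), transfer: (25, 32)
```

The resulting ranges agree with the phase boundaries given in the Figure 3 caption.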

Results

Accuracy-based results. Mean accuracies in each condition broken up into 25-trial blocks are shown in Figure 3. Note that there were no obvious differences between the parse-congruent and the parse-rotated conditions during either of the two training phases, and there were no obvious differences between the control-congruent and the control-rotated conditions during these phases. Both the parse-congruent and parse-rotated conditions showed much higher accuracy during the training phase than did the control-congruent and control-rotated conditions. Importantly, during transfer, the parse-rotated condition was impaired relative to the parse-congruent condition, and the control-rotated condition was impaired relative to the control-congruent condition. The magnitude of these impairments was nearly identical.

To test these conclusions formally, we performed a mixed-design, repeated-measures analysis of variance (ANOVA) with condition (parse-congruent, parse-rotated, control-congruent, control-rotated) and phase as factors. The ANOVA assumed Type 3 sums of squares and used the Satterthwaite approximation for degrees of freedom. We also estimated marginal population means from the same general linear model used in the ANOVA to compare pairwise differences between levels of each factor. All pairwise comparisons were adjusted for multiple comparisons using Tukey’s method with an experimentwise error rate set to 0.05. Detailed results are shown in Table 1.

All effects in the ANOVA were significant. The main effect of condition reflected the greater training phase accuracy in the parse conditions. The main effect of phase reflected the transfer impairment in the parse conditions. The Condition × Phase interaction reflected the greater transfer costs in the rotated conditions relative to the congruent conditions. The differences between the two congruent conditions and between the two rotated conditions were not significant. See Figure 4 for an illustration of these differences.

Figure 3. Experiment 1 accuracy. Each block included 25 trials, and bands represent standard errors of the mean. Blocks 1–12 constituted the first training phase, Blocks 13–24 constituted the second training phase, and Blocks 25–32 constituted the transfer phase. Vertical dashed lines partition separate phases. See the online article for the color version of this figure.

Table 1
Experiment 1 Statistics

Factor                                 Result
Mixed-design, repeated-measures analysis of variance
  Condition                            F(3, 94) = 28.48, p < .001 *
  Phase                                F(2, 1,789) = 162.17, p < .001 *
  Condition × Phase interaction        F(6, 1,789) = 31.36, p < .001 *

Condition                              Result
Training Phase 1 (a)
  Parse-congruent–parse-rotated        t(200) = −0.62, p = .53
  Parse-congruent–control-rotated      t(168) = 8.28, p < .001 *
  Parse-congruent–control-congruent    t(91) = 6.36, p < .001 *
  Parse-rotated–control-rotated        t(694) = 10.34, p < .001 *
  Parse-rotated–control-congruent      t(102) = 6.98, p < .001 *
  Control-congruent–control-rotated    t(106) = 1.08, p = .28
Training Phase 2 (a)
  Parse-congruent–parse-rotated        t(200) = −0.03, p = .98
  Parse-congruent–control-rotated      t(168) = 7.65, p < .001 *
  Parse-congruent–control-congruent    t(91) = 6.40, p < .001 *
  Parse-rotated–control-rotated        t(694) = 8.96, p < .001 *
  Parse-rotated–control-congruent      t(102) = 6.52, p < .001 *
  Control-congruent–control-rotated    t(106) = 0.50, p = .62
Pooled training phase (a)
  Parse-congruent–parse-rotated        t(123) = −0.37, p = .71
  Parse-congruent–control-rotated      t(106) = 8.95, p < .001 *
  Parse-congruent–control-congruent    t(61) = 7.04, p < .001 *
  Parse-rotated–control-rotated        t(399) = 11.46, p < .001 *
  Parse-rotated–control-congruent      t(67) = 7.49, p < .001 *
  Control-congruent–control-rotated    t(69) = 0.88, p = .31
Transfer phase (b)
  Parse-congruent–parse-rotated        t(291) = 4.53, p < .001 *
  Parse-congruent–control-rotated      t(240) = 3.50, p < .001 *
  Parse-congruent–control-congruent    t(126) = −0.08, p = .93
  Parse-rotated–control-rotated        t(963) = −0.75, p = .93
  Parse-rotated–control-congruent      t(144) = −3.94, p < .001 *
  Control-congruent–control-rotated    t(149) = 3.18, p < .01 *

Note. The F statistics were calculated on the basis of Satterthwaite’s approximation for denominator degrees of freedom, rounded down to the nearest integer. Significant results are marked with an asterisk.
(a) Parse conditions exhibited greater accuracy than control conditions.
(b) Rotated conditions were impaired relative to congruent conditions.

Model-based results. Optimal performance is 100% correct and required a procedural strategy on the full II categories that were used—for example, in the transfer phase of each condition. However, our participants did not perform optimally and, in fact, only reached accuracy levels that could be achieved by applying certain verbalizable rules. Thus, our analyses so far cannot rule out the possibility that transfer performance in Experiment 1 reflected rule use as opposed to procedural learning acquired during training. For example, consider what would happen if participants perseverated during the transfer phase with the rules they used during training. Note that participants classifying stimuli according to the bar-thickness rule used during training would achieve 75% correct in both the rotated and control conditions. However, participants classifying stimuli according to the orientation rule used during training would achieve 75% correct in the control conditions but only 25% correct in the rotated condition. It seems unlikely that this would be the case, because any participant using a rule that yielded only 25% accuracy could easily invert their rule to obtain 75% correct. Nevertheless, it is possible that the impairment in the rotated conditions stemmed from rule perseveration. To examine this possibility, we estimated each participant’s strategy by partitioning their transfer phase data into blocks of 100 trials and fitting different types of decision-bound models (e.g., Ashby, Waldron, Lee, & Berkman, 2001; Maddox & Ashby, 1993) to each block of data from every participant.¹ One type of model assumed an RB decision strategy that was consistent with either a horizontal or a vertical decision bound. A vertical bound, for example, is consistent with the explicit rule to respond A if the bars are thick and B if they are thin. Another type of model assumed an II strategy that was equivalent to a linear classifier with arbitrary slope and intercept, and a third type assumed random guessing (see Appendix B for details). Note that the II models were a simple proxy used to identify patterns of responses most consistent with optimal performance and are not meant as process models of categorization. The numbers of participants in each condition best fit by a model of these three types during every block of the experiment are shown in Table 2.
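The perseveration accuracies quoted above (75% and 25%) follow from the geometry of the categories, which the following Python sketch checks by brute force. It is an illustration rather than the authors' analysis: it assumes that x is bar width and y is bar orientation on the 0 to 100 scale, that the training rules place their criterion at 50, and that the rotation is about the center of the space. The offset grid is a hypothetical device that keeps every point off the category bounds and rule criteria so that no ties occur.

```python
def label_congruent(x, y):
    """Original II categories: A if x > y."""
    return "A" if x > y else "B"


def label_rotated(x, y):
    """Categories rotated 90 degrees counterclockwise about (50, 50)."""
    return "A" if x + y > 100 else "B"


def width_rule(x, y):
    """Rule on bar width learned in training: thick (x > 50) -> A."""
    return "A" if x > 50 else "B"


def orientation_rule(x, y):
    """Rule on orientation learned in training: y < 50 -> A."""
    return "A" if y < 50 else "B"


def inverted_orientation_rule(x, y):
    """The orientation rule with its response assignments reversed."""
    return "B" if y < 50 else "A"


def accuracy(rule, labels, points):
    """Proportion of points on which the rule matches the category label."""
    hits = sum(rule(x, y) == labels(x, y) for x, y in points)
    return hits / len(points)


# 100 x 100 grid; the offsets keep points off x = y, x + y = 100, and 50.
points = [(i + 0.25, j + 0.6) for i in range(100) for j in range(100)]

print(accuracy(width_rule, label_congruent, points))               # 0.75
print(accuracy(orientation_rule, label_congruent, points))         # 0.75
print(accuracy(width_rule, label_rotated, points))                 # 0.75
print(accuracy(orientation_rule, label_rotated, points))           # 0.25
print(accuracy(inverted_orientation_rule, label_rotated, points))  # 0.75
```

The last line illustrates the argument in the text: a participant whose orientation rule fell to 25% after rotation could recover 75% simply by reversing the response assignments.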
There was no significant difference between the proportion of participants best fit by RB models in the parse-congruent and control-congruent conditions during the first 100 trials of transfer, t(38) = 0.00, p = .50, or during the last 100 trials of transfer, t(38) = −0.80, p = .21. These differences were also nonsignificant in the parse-rotated and control-rotated conditions (first 100 trials: t(33) = −1.36, p = .09; second 100 trials: t(33) = −1.02, p = .16). This would seem to argue against rule perseveration as an explanation for the impairment observed in the rotated conditions (i.e., because there were not proportionally more best fitting RB models in the parse-rotated condition than there were in the control-rotated condition). Unfortunately, this inference cannot be made with confidence because there was an unusually high proportion of participants best fit by RB models in the control-rotated condition. For example, the proportion of participants best fit by an RB model during the last 100 trials of training was significantly higher in the control-rotated condition than in the control-congruent condition, t(36) = −3.17, p < .005. In fact, the proportion of RB users during the last training block in the control-rotated condition was at the level of that observed in the parse-rotated condition, t(33) = 0.34, p = .37. Thus, the exact role of RB strategies in generating our results is unfortunately difficult to assess.

Strategy reports. Intuition suggests that participants in the parse conditions would have adopted a declarative strategy during training, because simple one-dimensional rules yield perfect accuracy. Nevertheless, it is difficult to completely rule out the possibility that participants might have adopted a procedural strategy during the training phase. To help with this concern, we asked all participants in Experiment 1 to write down their response strategy between training blocks. Every written report described a one-dimensional rule. We can also appeal to decision-bound models to help assess whether participants used declarative strategies during training. Table 2 shows that nearly twice as many participants were best fit by an RB strategy as were best fit by an II strategy during the last block of training in the parse-congruent and parse-rotated conditions. These RB models fit very well, accounting for 91% of all responses in the parse-congruent condition and 95% in the parse-rotated condition (as shown in Table 2). Further, the parameters of the II models that best fit the remaining third of our participants described linear classification strategies that only deviated from a one-dimensional RB strategy on a few trials (not shown in Table 2).
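The decision-bound analysis used in this section can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the procedure of Appendix B, which is not reproduced here. It assumes that each model predicts the probability of an A response from the signed distance to a linear bound with Gaussian noise, that parameters are found by a coarse grid search rather than a numerical optimizer, and that models are compared with BIC. The grids, parameter counts, and all names are hypothetical.

```python
import math
from itertools import product


def norm_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def neg_log_lik(trials, theta_deg, c, sigma):
    """-ln L of A/B responses under a linear bound with Gaussian noise.

    P(respond A | x, y) = Phi((cos(theta)*x + sin(theta)*y - c) / sigma)
    """
    a = math.cos(math.radians(theta_deg))
    b = math.sin(math.radians(theta_deg))
    total = 0.0
    for x, y, resp in trials:
        p_a = norm_cdf((a * x + b * y - c) / sigma)
        p_a = min(max(p_a, 1e-9), 1.0 - 1e-9)  # guard against log(0)
        total -= math.log(p_a if resp == "A" else 1.0 - p_a)
    return total


# Coarse, hypothetical search grids (not taken from the paper).
CRITERIA = [5.0 * i for i in range(-20, 21)]  # -100, -95, ..., 100
SIGMAS = [1.0, 5.0, 20.0]
MODELS = {
    # name: (candidate bound orientations in degrees, free parameters)
    "RB-x": ([0, 180], 2),          # vertical bound: rule on bar width
    "RB-y": ([90, 270], 2),         # horizontal bound: rule on orientation
    "II": (range(0, 360, 15), 3),   # linear bound of arbitrary slope
}


def fit_block(trials):
    """Return (best_model_name, {name: BIC}) for one block of trials."""
    n = len(trials)
    bic = {"Guess": 2.0 * n * math.log(2.0)}  # P(A) = .5, no free parameters
    for name, (thetas, k) in MODELS.items():
        best = min(neg_log_lik(trials, t, c, s)
                   for t, c, s in product(thetas, CRITERIA, SIGMAS))
        bic[name] = k * math.log(n) + 2.0 * best
    return min(bic, key=bic.get), bic


# Two idealized 100-trial responders on a regular grid of stimuli.
grid = [5.0 + 10.0 * i for i in range(10)]
rule_user = [(x, y, "A" if x > 50 else "B") for x in grid for y in grid]
integrator = [(x, y, "A" if x > y else "B") for x in grid for y in grid]

best_rule_user, _ = fit_block(rule_user)
best_integrator, _ = fit_block(integrator)
print(best_rule_user)    # RB-x
print(best_integrator)   # II
```

The II model can always mimic a one-dimensional bound, so it is the extra-parameter penalty in BIC that lets the RB model win for the idealized rule user. This mirrors the observation above that some best-fitting II models deviated from a one-dimensional rule on only a few trials.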

Discussion

Phase 3 transfer performance was significantly better in the congruent conditions than in the rotated conditions. These results support the hypothesis that the procedural system learned during training even though the explicit system controlled responding. In fact, because the impairment in the parse-rotated condition was not significantly different from the impairment in the control-rotated condition, our results suggest that procedural learning was just as good as it is when the procedural system controls responding.

However, there is another interpretation of the Experiment 1 results. It is logically possible that participants in the parse-congruent condition performed well during Phase 3 transfer by persisting with the rules that they learned during Phases 1 and 2. Thus, according to this interpretation, parse-congruent participants never used procedural strategies, in which case Experiment 1 says nothing about whether procedural learning occurs while the explicit system controls responding. We believe this possibility is unlikely because rules are usually applied flexibly, and any participant using a rule that performed at well below chance level ought to be able to easily reverse his or her strategy. Even so, the results of Experiment 1 cannot rule out this possibility. Thus, we designed Experiment 2 to test between these two interpretations of Experiment 1.

Experiment 2

If the successful transfer observed in the parse-congruent condition and the interference observed in the parse-rotated condition of Experiment 1 resulted from procedural learning that occurred while participants were using rules during Phases 1 and 2, then applying a manipulation known to interfere with procedural but not RB learning during these phases should reduce transfer performance. However, if participants used rules throughout the experiment, then such a manipulation should not affect transfer performance. Experiment 2 followed this logic. The manipulation we used that is known to interfere with learning in II but not RB tasks was to delay feedback by a few seconds (Maddox, Ashby, & Bohil, 2003; Worthy, Markman, & Maddox, 2013). Thus, Experiment 2 replicated the parse-congruent and parse-rotated conditions of Experiment 1 with and without a feedback delay during Phases 1 and 2.

¹ We chose to fit the models to blocks of 100 trials instead of the blocks of 25 trials used for the accuracy analyses because the reliabilities of the fits were greatly improved by a larger sample size.


Method

Participants. There were 26 participants in the congruent-delay condition, 23 in the rotated-delay condition, 15 in the congruent-control condition, and 17 in the rotated-control condition. All participants completed the study and received course credit for their participation. All participants had normal or corrected-to-normal vision. To ensure that only participants who performed well above chance were included in the transfer phase analysis, a learning criterion of 65% correct during each training phase was applied. Using this criterion, we excluded six participants from the congruent-delay condition, five from the rotated-delay condition, two from the congruent-control condition, and none from the rotated-control condition.

Stimuli. Stimuli were identical to those used in Experiment 1, with the exception that there were 600 exemplars instead of 800. Participants were shown stimulus masks in the intervals between response and feedback. We generated these masks by computing the inverse Fourier transform of each of the 600 category stimuli. Masks were randomly sampled without replacement on each training phase trial.

Procedures. All participants received 12 blocks of 50 trials each for a total of 600 trials. The first four of these blocks constituted the first training phase, the second four constituted the second training phase, and the last four constituted the transfer phase. Participants in all conditions received one-dimensional RB stimuli during all training blocks (Blocks 1–8). One one-dimensional rule was presented during the first four training blocks (Blocks 1–4), and the other was presented during the final four training blocks (Blocks 5–8). The order in which the RB categories were presented was counterbalanced across participants.
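The learning criterion described in the Participants paragraph can be written as a simple filter. The Python sketch below is a hedged illustration: whether a score of exactly 65% counted as passing is not stated, so the `>=` comparison is an assumption, and the two example accuracy pairs are invented for demonstration only. The enrollment and exclusion counts are the ones reported above for Experiment 2.

```python
CRITERION = 0.65  # minimum proportion correct required in each training phase


def meets_criterion(phase_accuracies, criterion=CRITERION):
    """True if accuracy reached the criterion in every training phase.

    Assumption: a score exactly at the criterion counts as passing.
    """
    return all(acc >= criterion for acc in phase_accuracies)


# Hypothetical participants: (Phase 1 accuracy, Phase 2 accuracy).
print(meets_criterion((0.92, 0.88)))  # True
print(meets_criterion((0.92, 0.61)))  # False: Phase 2 is below criterion

# Reported Experiment 2 counts: condition -> (enrolled, excluded).
REPORTED = {
    "congruent-delay": (26, 6),
    "rotated-delay": (23, 5),
    "congruent-control": (15, 2),
    "rotated-control": (17, 0),
}
retained = {cond: n - out for cond, (n, out) in REPORTED.items()}
print(retained)
```

With the reported counts, 20, 18, 13, and 17 participants remain in the four conditions for the transfer phase analysis.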
Participants in the congruent-delay and congruent-control conditions were given the full underlying II categories during transfer, and participants in the rotated-delay and rotated-control condition were given the rotated II categories during transfer. See the top panels of Figure 5 for an illustration of this design. The sequence of events within a training phase trial is illustrated in the bottom left panel of Figure 5, and the sequence of events for a transfer phase trial is illustrated in the bottom right panel. Instructions were identical to those given for Experiment 1, with the addition that participants were told that they would see a noise pattern in between response and feedback and that this pattern had no relevant information. They were also told that it would disap-

Figure 4. Effects of parsed training on transfer accuracy. Parse data points are the mean accuracy differences between the parse-congruent and the parse-rotated conditions. Control data points are the mean accuracy differences between the controlcongruent and the control-rotated conditions. Parse-control data points are the mean accuracy differences between the parse and control differences. Error bars represent 95% confidence intervals.

PROCEDURAL LEARNING

1395

Table 2 Experiment 1 Decision-Bound Model Fit Summary

This document is copyrighted by the American Psychological Association or one of its allied publishers. This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.

Parse-congruent Phase

Block

1 1 1 2 2 2 3 3

1 2 3 4 5 6 7 8

1 1 1 2 2 2 3 3

1 2 3 4 5 6 7 8

II

5 5 3 14 6 6 14 15

RB

II

RB

Control-congruent Guess

Numbers of participants best fit by a model 1 8 8 2 0 3 14 1 0 3 14 1 0 5 12 1 2 6 10 2 1 6 10 2 1 9 6 3 2 7 7 4

13 14 16 5 11 12 4 2

.81 .84 .91 .86 .88 .89 .80 .79

Parse-rotated Guess

.86 .88 .90 .79 .93 .91 .80 .79

II

RB

Control-rotated

Guess

of a selected type 9 10 14 6 11 9 14 5 16 4 16 3 13 6 14 6

Proportions of responses accounted for by the best fitting models .86 .89 .75 .82 .90 .89 .80 .77 .86 .92 .81 .79 .89 .86 .80 .80 .87 .92 .80 .79 .87 .95 .83 .77 .72 .76 .80 .77 .79 .76 .80 .70

2 1 1 2 1 2 2 1

II

3 8 8 8 7 9 6 7

.81 .79 .82 .82 .80 .82 .78 .79

RB

12 9 9 8 9 8 9 9

Guess

2 0 0 1 1 0 2 1

.74 .80 .81 .83 .79 .80 .76 .73

Note. II ⫽ information integration; RB ⫽ rule based.

pear near the end of the experiment and that this also carried no relevant information.
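The bookkeeping in the Method section can be checked with a few lines of arithmetic. The sketch below is illustrative only: the enrolled and excluded counts are the values stated above, while the function names and the reading of the criterion as "at least 65% in every training phase" are assumptions made for this example, not details taken from the authors' analysis code.

```python
"""Retained sample sizes and block schedule for Experiment 2 (illustrative sketch)."""

# Enrolled and excluded counts as stated in the Participants paragraph.
ENROLLED = {"congruent-delay": 26, "rotated-delay": 23,
            "congruent-control": 15, "rotated-control": 17}
EXCLUDED = {"congruent-delay": 6, "rotated-delay": 5,
            "congruent-control": 2, "rotated-control": 0}

CRITERION = 0.65                 # 65% correct during each training phase
BLOCKS, TRIALS_PER_BLOCK = 12, 50


def meets_criterion(training_phase_accuracies, criterion=CRITERION):
    """Assumption: the criterion means accuracy >= 0.65 in every training phase."""
    return all(acc >= criterion for acc in training_phase_accuracies)


def retained():
    """Participants left in each condition after applying the exclusions."""
    return {cond: ENROLLED[cond] - EXCLUDED[cond] for cond in ENROLLED}


def pooled_df(cond_a, cond_b):
    """Degrees of freedom for a two-sample comparison: n_a + n_b - 2."""
    n = retained()
    return n[cond_a] + n[cond_b] - 2


def phase_of_block(block):
    """Map a 50-trial block (1-12) to its phase, following the Procedures paragraph."""
    if not 1 <= block <= BLOCKS:
        raise ValueError("block must be between 1 and 12")
    if block <= 4:
        return "training 1"
    if block <= 8:
        return "training 2"
    return "transfer"


if __name__ == "__main__":
    print(retained())
    print(pooled_df("congruent-delay", "rotated-delay"))
    print(pooled_df("congruent-control", "rotated-control"))
    print([phase_of_block(b) for b in range(1, BLOCKS + 1)])
```

Under these assumptions the retained samples are 20, 18, 13, and 17 participants, and the two pooled values (36 and 28) agree with the degrees of freedom of the t tests reported in the model-based results.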

Results

Accuracy-based results. Mean accuracies in each condition broken up into 25-trial blocks are shown in Figure 6. Note that there may have been differences between the congruent-delay and rotated-delay conditions during each of the two training phases but that there were no obvious differences between the congruent-control and the rotated-control conditions during either training phase. During the transfer phase, there appears to have been a slight impairment in the rotated-delay condition relative to the congruent-delay condition, but there was a much larger impairment in the rotated-control condition than in the congruent-control condition.

To test these conclusions formally, we performed a mixed-design, repeated-measures ANOVA with condition (congruent-delay, rotated-delay, congruent-control, rotated-control) and phase as factors. The ANOVA assumed Type 3 sums of squares and used the Satterthwaite approximation for degrees of freedom. We also estimated marginal population means from the same general linear model used in the ANOVA to compare pairwise differences between levels of each factor. All pairwise comparisons were adjusted for multiple comparisons using Tukey’s method with an experimentwise error rate set to 0.05. Detailed results are shown in Table 3. All effects in the ANOVA were significant and were driven by a number of differences between the conditions during all phases of the experiment. The important points are that (a) there were no accuracy differences between the congruent-delay and rotated-delay conditions or between the congruent-control and rotated-control conditions when the data were pooled across both training phases, and (b) during the transfer phase, rotated-control accuracy was impaired relative to congruent-control accuracy, but this impairment disappeared in the delay conditions. Moreover, the difference in accuracy during the transfer phase between the delay conditions (congruent-delay minus rotated-delay) was significantly smaller than the difference in accuracy between the control conditions (congruent-control minus rotated-control). See Figure 7 for an illustration of these differences.

Model-based results. We partitioned the transfer phase data from each participant into blocks of 100 trials and fit the same decision-bound models as in Experiment 1 to each block of data from every participant. The numbers of participants in each condition best fit by a model of each of the three types during the transfer phase are shown in Table 4. There was no significant difference between the congruent-delay and rotated-delay conditions in the number of participants best fit by a model that assumed a procedural strategy during the first 100 trials of transfer, t(36) = 0.33, p = .37, or during the last 100 trials of transfer, t(36) = 0.66, p = .26. The difference between the congruent-control and rotated-control conditions was significant for both the first 100 transfer trials, t(28) = 2.19, p < .05, and the last 100 transfer trials, t(28) = 4.11, p < .001.
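The key transfer-phase comparison above is a difference of differences: the congruent-minus-rotated accuracy gap in the delay conditions, minus the same gap in the control conditions. A minimal sketch of that arithmetic follows. The accuracy values are hypothetical placeholders chosen for illustration, not data from the experiment, and the function names are invented for this example.

```python
from statistics import mean

# Hypothetical per-participant transfer accuracies; NOT data from the experiment.
transfer_accuracy = {
    "congruent-delay":   [0.70, 0.74, 0.72],
    "rotated-delay":     [0.68, 0.72, 0.70],
    "congruent-control": [0.80, 0.84, 0.82],
    "rotated-control":   [0.62, 0.66, 0.64],
}


def congruency_effect(acc, congruent, rotated):
    """Mean accuracy in the congruent condition minus mean accuracy in the rotated one."""
    return mean(acc[congruent]) - mean(acc[rotated])


def delay_minus_control(acc):
    """Return (delay effect, control effect, delay effect - control effect)."""
    delay = congruency_effect(acc, "congruent-delay", "rotated-delay")
    control = congruency_effect(acc, "congruent-control", "rotated-control")
    return delay, control, delay - control


if __name__ == "__main__":
    delay, control, contrast = delay_minus_control(transfer_accuracy)
    print(f"delay effect:        {delay:.2f}")
    print(f"control effect:      {control:.2f}")
    print(f"delay minus control: {contrast:.2f}")
```

A negative contrast means the congruency effect was smaller after delayed feedback than after immediate feedback, which is the direction of the result described above. The sketch computes only the point estimate; it does not reproduce the confidence intervals or the significance test.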

Discussion

Previous research has shown that a feedback delay of even a few seconds impairs II category learning but not RB learning (Maddox et al., 2003; Worthy et al., 2013). Thus, if participants in the parse-congruent condition of Experiment 1 used explicit rules during transfer, then a feedback delay during training should have no effect on their transfer performance. Conversely, if the good transfer performance of this group in Experiment 1 was because of procedural learning that occurred during the RB training, then a feedback delay during RB training should impair transfer performance. The results of Experiment 2 strongly supported this latter prediction. Specifically, we found that there was a significantly larger difference in transfer phase accuracy between the control-congruent and control-rotated conditions than between the delay-congruent and delay-rotated conditions. This interpretation was reinforced by model-based analyses that showed that the number of participants best fit by an II model was not significantly different between the two delay conditions but was significantly different between the two control conditions.

Figure 5. Experiment 2 design. Top: Each condition in Experiment 2 contained two training phases and a transfer phase in which the underlying categories matched those from Experiment 1. Bottom-left: Within-trial event timing differed between the delay conditions and the control conditions. Delay conditions contained a long delay between response and feedback and a short intertrial interval (ITI). Control conditions included a short delay and a long ITI. Total trial length was equated between conditions. Bottom-right: Within-trial event timing changed to that used in Experiment 1 during the transfer phase. See the online article for the color version of this figure.

Figure 6. Experiment 2 accuracy. Each block included 25 trials, and bands represent standard errors of the mean. Blocks 1–8 constituted the first training phase, Blocks 9–16 constituted the second training phase, and Blocks 17–24 constituted the transfer phase. Vertical dashed lines partition separate phases. See the online article for the color version of this figure.

Table 3
Experiment 2 Statistics

Mixed-design, repeated-measures analysis of variance

Factor                                Result
Condition                             F(3, 80) = 4.29, p < .01*
Phase                                 F(2, 1,551) = 258.81, p < .001*
Condition × Phase interaction         F(6, 1,551) = 7.72, p < .001*

Pairwise comparisons

Condition                             Result
Training Phase 1 (a)
Congruent-delay–rotated-delay         t(146) = 2.82, p < .01*
Congruent-delay–congruent-control     t(143) = −0.52, p = .604
Congruent-delay–rotated-control       t(135) = −0.74, p = .462
Rotated-delay–congruent-control       t(252) = −3.20, p < .01*
Rotated-delay–rotated-control         t(233) = −3.61, p < .001*
Congruent-control–rotated-control     t(169) = −0.16, p = .869
Training Phase 2 (b)
Congruent-delay–rotated-delay         t(146) = −0.77, p = .444
Congruent-delay–congruent-control     t(143) = −1.77, p = .079
Congruent-delay–rotated-control       t(135) = −2.49, p < .05*
Rotated-delay–congruent-control       t(252) = −1.11, p = .267
Rotated-delay–rotated-control         t(233) = −1.80, p = .074
Congruent-control–rotated-control     t(169) = −0.54, p = .592
Pooled training phase (c)
Congruent-delay–rotated-delay         t(86) = 1.17, p = .25
Congruent-delay–congruent-control     t(84) = −1.31, p = .19
Congruent-delay–rotated-control       t(80) = −1.83, p = .07
Rotated-delay–congruent-control       t(139) = −2.51, p < .05*
Rotated-delay–rotated-control         t(130) = −3.12, p < .01*
Congruent-control–rotated-control     t(90) = −0.39, p = .70
Transfer phase (d)
Congruent-delay–rotated-delay         t(146) = 1.82, p = .071
Congruent-delay–congruent-control     t(143) = −2.28, p < .05*
Congruent-delay–rotated-control       t(135) = 2.09, p < .05*
Rotated-delay–congruent-control       t(252) = −4.09, p < .001*
Rotated-delay–rotated-control         t(233) = 0.33, p = .741
Congruent-control–rotated-control     t(169) = 4.16, p < .001*

Note. The F statistics were calculated on the basis of Satterthwaite’s approximation for denominator degrees of freedom, rounded down to the nearest integer. Significant results are marked with an asterisk.
(a) Rotated-delay accuracy was below that for all other conditions.
(b) Congruent-delay accuracy was below rotated-control accuracy.
(c) There were no accuracy differences between congruent-delay and rotated-delay or between congruent-control and rotated-control.
(d) Rotated-control accuracy was impaired relative to congruent-control accuracy, but this impairment disappeared in the delay conditions.

General Discussion

The results from Experiments 1 and 2 suggest that procedural learning occurs during declarative control. This result resonates with intuitive notions of how a variety of motor and cognitive skills are learned, and it adds a critical clue to a literature that is somewhat mixed on the subject. Neuroimaging evidence has reliably found an antagonistic relationship between neural substrates for procedural and declarative memory (Poldrack et al., 2001; Poldrack & Packard, 2003; Schroeder et al., 2002), although, more recently, Foerde et al. (2006) reported persistent striatal activation even during declarative control. All previous behavioral studies investigating interactions between procedural and declarative learning have found an antagonistic relationship (Ashby & Crossley, 2010). The present results suggest that if there is inhibition between procedural and declarative systems, it may operate only at the level of expression and not at the level of learning.

Evidence suggests that the sensorimotor basal ganglia, and the dorsolateral striatum in particular, form the basis for procedural memory, whereas declarative memory is most often associated with prefrontal cortex and medial temporal lobe structures (Ashby & Ennis, 2006; Fletcher, Shallice, Frith, Frackowiak, & Dolan, 1998; Mishkin, Malamut, & Bachevalier, 1984; Tulving & Markowitsch, 1998; Willingham, 1998). There are many possible anatomical projections that could mediate the interaction between these networks. Ashby and Crossley (2010) suggested that the hyperdirect pathway through the basal ganglia might mediate system interactions. The hyperdirect pathway begins with direct excitatory glutamate projections from frontal cortex to the subthalamic nucleus. The subthalamic nucleus then sends excitatory glutamate projections directly to the internal segment of the globus pallidus (Joel & Weiner, 1997; Parent & Hazrati, 1995). This extra excitatory input to the globus pallidus tends to offset inhibitory input from the striatum, making it more difficult for striatal activity to affect cortex. Hence, the hyperdirect pathway could permit (by reducing subthalamic activity) or prevent (by increasing subthalamic activity) signals coming from the striatum from influencing cortex. The hyperdirect pathway could prevent the procedural system from accessing motor output systems in cortex, but it does not directly interact with the striatum (i.e., the presumed site of learning in the procedural system), and, therefore, it does not exclude the possibility of learning in the procedural system while declarative systems control responding. This possibility is not without significant challenges, but we leave a full discussion of these factors to future work.

Our results speak to a broad literature, spanning multiple behavioral paradigms, that is more generally concerned with untangling the contributions of multiple systems to behavior. Essentially, all current formal models of multiple systems assume that the systems learn independently. This is true across a variety of behavioral paradigms, including reinforcement learning, instrumental conditioning, sequence learning, and motor learning. We briefly review some of the most influential of these here.

Multiple-systems accounts of instrumental conditioning dissociate habitual from goal-directed behaviors, which are distinguished from each other according to their sensitivity to outcome devaluation and changing response-outcome contingencies (Yin & Knowlton, 2006). Specifically, a behavior is considered goal directed if the rate or likelihood of the behavior is decreased by reductions in the expected value of the outcome and by reductions in the contingency between the action and the outcome. In contrast, habits are behaviors that have become insensitive to reductions in both outcome value and response-outcome contingency (Dickinson, 1985; Yin, Ostlund, & Balleine, 2008). Goal-directed behaviors require dorsal-medial striatal networks, and habitual behaviors require dorsal-lateral striatal networks (Yin & Knowlton, 2004; Yin, Knowlton, & Balleine, 2005; Yin et al., 2008). Lesion studies show that behavior can be switched from goal directed to habitual and vice versa, suggesting that these two systems learn independently (Balleine & Dickinson, 1998; Coutureau & Killcross, 2003; Killcross & Coutureau, 2003; Yin et al., 2004).
Figure 7. Effects of delayed feedback on transfer accuracy. Delay data points are the mean accuracy differences between the congruent-delay and the rotated-delay conditions. Control data points are the mean accuracy differences between the congruent-control and the rotated-control conditions. Delay-control data points are the mean accuracy differences between the delay and control differences. Error bars represent 95% confidence intervals.

Table 4
Experiment 2 Decision-Bound Model Fit Summary

Numbers of participants best fit by a model of a selected type

                  Delay-congruent        Delay-rotated    Control-congruent      Control-rotated
Phase  Block     II     RB  Guess     II     RB  Guess     II     RB  Guess     II     RB  Guess
1      1         10     14      2      9     10      4      3     12      0      5     12      0
1      2          7     19      0      4     16      3      3     12      0      4     13      0
2      3         14     12      0     13      8      2      8      7      0      9      7      1
2      4          7     18      1      6     17      0      5      9      1      5     12      0
3      5         14     10      2     14      9      0      8      7      0      6      9      2
3      6         13      8      5      9     10      4     12      2      1      5      8      4

Proportions of responses accounted for by the best fitting models

               Delay-congruent     Delay-rotated Control-congruent   Control-rotated
Phase  Block       II       RB       II       RB       II       RB       II       RB
1      1          .82      .86      .78      .81      .89      .87      .85      .85
1      2          .85      .93      .89      .92      .97      .91      .86      .89
2      3          .87      .86      .84      .91      .86      .88      .88      .88
2      4          .86      .91      .86      .87      .90      .94      .90      .92
3      5          .79      .75      .77      .72      .81      .78      .76      .75
3      6          .79      .76      .80      .80      .83      .84      .84      .80

Note. II = information integration; RB = rule based.

Daw, Niv, and Dayan (2005) synthesized these behavioral results into a theory based in the machine learning language of reinforcement learning (RL), which assumes that a model-based controller (the RL counterpart to goal-directed behavior) competes for control of behavior with a model-free controller (the RL counterpart of habitual behavior). Importantly, Daw et al. assumed that these two controllers learn independently. A multiple-systems account similar to the Daw et al. (2005) model couched in model-based and model-free RL has been widely used in recent human work. The take-home message is that this work supports the notion that there are multiple systems that learn via something like model-based and model-free RL and that have at least partially distinct neural substrates. Even so, isolating one system from the other has proven challenging. For example, model-based RL signatures seem to appear in many of the same regions where the brain processes reward information, including areas classically thought to reflect exclusively model-free processing (e.g., the ventral striatum; Doll, Simon, & Daw, 2012). No behavioral study in humans has explored whether the independent learning assumption of Daw et al. holds for human behavior. It is tempting to think, at the very least on the basis that procedural memory is closely tied to dopamine, that procedural learning is mediated by model-free RL and that declarative memory-based learning is mediated by model-based RL. Although such a mapping almost certainly oversimplifies things, it is probably not entirely wrong. As preliminary evidence along these lines, Otto, Gershman, Markman, and Daw (2013) recently showed that the degree to which participants appear to respond in a model-based fashion can be up- or down-regulated by applying an explicit dual task. This result echoes classic results dissociating II and RB category learning and lends credence to the procedural/declarative, model-free/model-based mapping.

Theoretical accounts of sequence learning have also assumed multiple systems. In particular, Keele, Ivry, Mayr, Hazeltine, and Heuer (2003) proposed that sequence learning is controlled by the interplay of separate unidimensional and multidimensional systems. The unidimensional system is entirely implicit and is driven by perceptually raw stimulus or response features. The neurobiological underpinnings of this system lie mostly in dorsal regions, including supplementary motor, primary motor, and parietal cortices. The multidimensional system can be explicit or implicit and is driven by perceptually abstract stimulus or response features. The neurobiological underpinnings of this system lie mostly in ventral regions, including occipital, temporal, prefrontal, and lateral premotor cortex. Importantly, this account of sequence learning assumes that the systems compete for control of behavior but learn independently. This theory was not expressed explicitly in terms of interactions between procedural and declarative memory systems, but the behavioral sensitivities and neurobiological underpinnings of the model nevertheless make it clear that such a comparison is not unreasonable. Importantly, learning within each system again occurs independently.

Finally, even in simple motor-learning tasks, such as visuomotor adaptation, learning has been described as arising from the interaction of multiple systems. For example, Taylor and Ivry (2012) recently proposed that learning in such tasks is mediated by two processes: an implicit cerebellar-based process that reflects the adaptation of an internal model by a sensory prediction error signal (i.e., the difference between where a person plans to reach and the feedback he or she receives) and a separate prefrontal-based process that reflects aiming strategies and learns from a distinct error signal (i.e., the difference between the feedback a person receives and his or her outcome goal). As in all previously described theories, these two processes operate and learn independently.

This article has shown that procedural learning occurs during declarative control. A natural follow-up question is this: What exactly does the procedural system learn when the declarative system is in control of the response? For instance, does the procedural system learn independently of the pattern of responses (and, therefore, feedback) obtained by the controlling declarative system? Or, rather, does what the procedural system learns somehow depend on the declarative system’s responses? As we have just reviewed, the classic assumption underlying the vast majority of multiple-systems descriptions of behavior is one of independent learning. However, we know of no evidence that directly supports this assumption and note that at least some recent work has suggested that the independence assumption may be wrong (Doll, Hutchison, & Frank, 2011; Paul & Ashby, 2013). The present experiments do not speak directly to this important question, and so we leave further consideration to future work.
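The independent-learning assumption reviewed above can be made concrete with a toy two-action task. The sketch below is an illustration written for this purpose, not the Daw et al. (2005) model or any model fitted in this article. It assumes a delta-rule value learner as a stand-in for a model-free system and a simple outcome-frequency tally as a stand-in for a model-based system. The trial list, the learning rate, and all names are hypothetical.

```python
"""Toy illustration of two learners updating from the same trials (hypothetical)."""

# Each trial: (system assumed to be in control, action taken, reward received).
# The sequence is invented for illustration; it is not experimental data.
TRIALS = [
    ("declarative", "A", 1), ("declarative", "A", 1), ("declarative", "B", 0),
    ("declarative", "A", 1), ("declarative", "B", 1), ("declarative", "A", 0),
    ("declarative", "B", 0), ("declarative", "A", 1),
]


def simulate(trials, alpha=0.2):
    """Update both learners on every trial, whichever system produced the response."""
    q_values = {"A": 0.0, "B": 0.0}          # model-free stand-in (delta rule)
    tallies = {"A": [0, 0], "B": [0, 0]}     # model-based stand-in: [rewards, choices]
    for _controller, action, reward in trials:
        # The controller label is deliberately unused: neither update rule
        # depends on which system selected the action.
        q_values[action] += alpha * (reward - q_values[action])
        tallies[action][0] += reward
        tallies[action][1] += 1
    frequencies = {a: (r / n if n else 0.0) for a, (r, n) in tallies.items()}
    return q_values, frequencies


if __name__ == "__main__":
    q_values, frequencies = simulate(TRIALS)
    print("delta-rule values:     ", q_values)
    print("outcome frequencies:   ", frequencies)
```

Even in this sketch the two learners are not fully decoupled. Both receive feedback only about the actions the controlling system happens to select, so what the delta-rule learner acquires still depends on the controller's pattern of responses. That is one simple way to picture the open question raised in the final paragraph above.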

References

Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19, 716–723.
Ashby, F. G. (1992). Multidimensional models of categorization. In F. G. Ashby (Ed.), Multidimensional models of perception and cognition (pp. 449–483). Hillsdale, NJ: Erlbaum, Inc.
Ashby, F. G., Alfonso-Reese, L. A., Turken, A. U., & Waldron, E. M. (1998). A neuropsychological theory of multiple systems in category learning. Psychological Review, 105, 442–481. http://dx.doi.org/10.1037/0033-295X.105.3.442


Ashby, F. G., & Crossley, M. J. (2010). Interactions between declarative and procedural-learning categorization systems. Neurobiology of Learning and Memory, 94, 1–12. http://dx.doi.org/10.1016/j.nlm.2010.03.001
Ashby, F. G., & Crossley, M. J. (2012). Automaticity and multiple memory systems. Wiley Interdisciplinary Reviews: Cognitive Science, 3, 363–376. http://dx.doi.org/10.1002/wcs.1172
Ashby, F. G., Ell, S. W., & Waldron, E. M. (2003). Procedural learning in perceptual categorization. Memory & Cognition, 31, 1114–1125. http://dx.doi.org/10.3758/BF03196132
Ashby, F. G., & Ennis, J. M. (2006). The role of the basal ganglia in category learning. Psychology of Learning and Motivation, 46, 1–36. http://dx.doi.org/10.1016/S0079-7421(06)46001-1
Ashby, F. G., & Gott, R. E. (1988). Decision rules in the perception and categorization of multidimensional stimuli. Journal of Experimental Psychology: Learning, Memory, and Cognition, 14, 33–53. http://dx.doi.org/10.1037/0278-7393.14.1.33
Ashby, F. G., & Maddox, W. T. (2005). Human category learning. Annual Review of Psychology, 56, 149–178. http://dx.doi.org/10.1146/annurev.psych.56.091103.070217
Ashby, F. G., & Maddox, W. T. (2011). Human category learning 2.0. In M. B. Miller & A. Kingstone (Eds.), The year in cognitive neuroscience: Annals of the New York Academy of Sciences (Vol. 1224, pp. 147–161). New York: New York Academy of Sciences. http://dx.doi.org/10.1111/j.1749-6632.2010.05874.x
Ashby, F. G., & O’Brien, J. B. (2005). Category learning and multiple memory systems. Trends in Cognitive Sciences, 9, 83–89. http://dx.doi.org/10.1016/j.tics.2004.12.003
Ashby, F. G., Waldron, E. M., Lee, W. W., & Berkman, A. (2001). Suboptimality in human categorization and identification. Journal of Experimental Psychology: General, 130, 77–96. http://dx.doi.org/10.1037/0096-3445.130.1.77
Balleine, B. W., & Dickinson, A. (1998). Goal-directed instrumental action: Contingency and incentive learning and their cortical substrates. Neuropharmacology, 37, 407–419. http://dx.doi.org/10.1016/S0028-3908(98)00033-1
Coutureau, E., & Killcross, S. (2003). Inactivation of the infralimbic prefrontal cortex reinstates goal-directed responding in overtrained rats. Behavioural Brain Research, 146, 167–174. http://dx.doi.org/10.1016/j.bbr.2003.09.025
Dagher, A., Owen, A. M., Boecker, H., & Brooks, D. J. (2001). The role of the striatum and hippocampus in planning: A PET activation study in Parkinson’s disease. Brain, 124, 1020–1032. http://dx.doi.org/10.1093/brain/124.5.1020
Daw, N. D., Niv, Y., & Dayan, P. (2005). Uncertainty-based competition between prefrontal and dorsolateral striatal systems for behavioral control. Nature Neuroscience, 8, 1704–1711. http://dx.doi.org/10.1038/nn1560
Dickinson, A. (1985). Actions and habits: The development of behavioural autonomy. Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences, 308, 67–78. http://dx.doi.org/10.1098/rstb.1985.0010
Doll, B. B., Hutchison, K. E., & Frank, M. J. (2011). Dopaminergic genes predict individual differences in susceptibility to confirmation bias. Journal of Neuroscience, 31, 6188–6198. http://dx.doi.org/10.1523/JNEUROSCI.6486-10.2011
Doll, B. B., Simon, D. A., & Daw, N. D. (2012). The ubiquity of model-based reinforcement learning. Current Opinion in Neurobiology, 22, 1075–1081. http://dx.doi.org/10.1016/j.conb.2012.08.003
Eichenbaum, H., & Cohen, N. J. (2001). From conditioning to conscious recollection: Memory systems of the brain. New York: Oxford University Press.
Erickson, M. A., & Kruschke, J. K. (1998). Rules and exemplars in category learning. Journal of Experimental Psychology: General, 127, 107–140. http://dx.doi.org/10.1037/0096-3445.127.2.107

Filoteo, J. V., Maddox, W. T., Salmon, D. P., & Song, D. D. (2005). Information-integration category learning in patients with striatal dysfunction. Neuropsychology, 19, 212–222. http://dx.doi.org/10.1037/ 0894-4105.19.2.212 Fletcher, P. C., Shallice, T., Frith, C. D., Frackowiak, R. S., & Dolan, R. J. (1998). The functional roles of prefrontal cortex in episodic memory: II. Retrieval. Brain, 121, 1249 –1256. http://dx.doi.org/10.1093/brain/121.7 .1249 Foerde, K., Knowlton, B. J., & Poldrack, R. A. (2006). Modulation of competing memory systems by distraction. Proceedings of the National Academy of Sciences of the United States of America, 103, 11778 – 11783. http://dx.doi.org/10.1073/pnas.0602659103 Jenkins, I. H., Brooks, D. J., Nixon, P. D., Frackowiak, R. S. J., & Passingham, R. E. (1994). Motor sequence learning: A study with positron emission tomography. Journal of Neuroscience, 14, 3775– 3790. Joel, D., & Weiner, I. (1997). The connections of the primate subthalamic nucleus: Indirect pathways and the open-interconnected scheme of basal ganglia-thalamocortical circuitry. Brain Research Brain Research Reviews, 23, 62–78. http://dx.doi.org/10.1016/S0165-0173(96)00018-5 Keele, S. W., Ivry, R., Mayr, U., Hazeltine, E., & Heuer, H. (2003). The cognitive and neural architecture of sequence representation. Psychological Review, 110, 316 –339. http://dx.doi.org/10.1037/0033-295X.110.2 .316 Killcross, S., & Coutureau, E. (2003). Coordination of actions and habits in the medial prefrontal cortex of rats. Cerebral Cortex, 13, 400 – 408. http://dx.doi.org/10.1093/cercor/13.4.400 Knowlton, B. J., Mangels, J. A., & Squire, L. R. (1996, September 6). A neostriatal habit learning system in humans. Science, 273, 1399 –1402. http://dx.doi.org/10.1126/science.273.5280.1399 Love, B. C., Medin, D. L., & Gureckis, T. M. (2004). SUSTAIN: A network model of category learning. Psychological Review, 111, 309 – 332. http://dx.doi.org/10.1037/0033-295X.111.2.309 Maddox, W. 
T., & Ashby, F. G. (1993). Comparing decision bound and exemplar models of categorization. Perception & Psychophysics, 53, 49 –70. http://dx.doi.org/10.3758/BF03211715 Maddox, W. T., Ashby, F. G., & Bohil, C. J. (2003). Delayed feedback effects on rule-based and information-integration category learning. Journal of Experimental Psychology: Learning, Memory, and Cognition, 29, 650 – 662. http://dx.doi.org/10.1037/0278-7393.29.4.650 Maddox, W. T., Ashby, F. G., Ing, A. D., & Pickering, A. D. (2004). Disrupting feedback processing interferes with rule-based but not information-integration category learning. Memory & Cognition, 32, 582–591. http://dx.doi.org/10.3758/BF03195849 Maddox, W. T., Bohil, C. J., & Ing, A. D. (2004). Evidence for a procedural-learning-based system in perceptual category learning. Psychonomic Bulletin & Review, 11, 945–952. http://dx.doi.org/10.3758/ BF03196726 Mishkin, M., Malamut, B., & Bachevalier, J. (1984). Memories and habits: Two neural systems. In G. Lynch, J. L. McGaugh, & N. M. Weinberger (Eds.), The neurobiology of learning and memory (pp. 65–77). New York: Guilford Press. Mitchell, J. A., & Hall, G. (1988). Caudate-putamen lesions in the rat may impair or potentiate maze learning depending upon availability of stimulus cues and relevance of response cues. Quarterly Journal of Experimental Psychology B: Comparative and Physiological Psychology, 40, 243–258. Moody, T. D., Bookheimer, S. Y., Vanek, Z., & Knowlton, B. J. (2004). An implicit learning task activates medial temporal lobe in patients with Parkinson’s disease. Behavioral Neuroscience, 118, 438 – 442. http://dx .doi.org/10.1037/0735-7044.118.2.438 Nomura, E. M., Maddox, W. T., Filoteo, J. V., Ing, A. D., Gitelman, D. R., Parrish, T. B., . . . Reber, P. J. (2007). Neural correlates of rule-based and

This document is copyrighted by the American Psychological Association or one of its allied publishers. This article is intended solely for the personal use of the individual user and is not to be disseminated broadly.

information-integration visual category learning. Cerebral Cortex, 17, 37–43. http://dx.doi.org/10.1093/cercor/bhj122
O’Keefe, J., & Nadel, L. (1978). The hippocampus as a cognitive map. New York: Oxford University Press.
Otto, A. R., Gershman, S. J., Markman, A. B., & Daw, N. D. (2013). The curse of planning: Dissecting multiple reinforcement-learning systems by taxing the central executive. Psychological Science, 24, 751–761. http://dx.doi.org/10.1177/0956797612463080
Parent, A., & Hazrati, L.-N. (1995). Functional anatomy of the basal ganglia: II. The place of subthalamic nucleus and external pallidum in basal ganglia circuitry. Brain Research Reviews, 20, 128–154. http://dx.doi.org/10.1016/0165-0173(94)00008-D
Paul, E. J., & Ashby, F. G. (2013). A neurocomputational theory of how explicit learning bootstraps early procedural learning. Frontiers in Computational Neuroscience, 7, 177. http://dx.doi.org/10.3389/fncom.2013.00177
Poldrack, R. A., Clark, J., Paré-Blagoev, E. J., Shohamy, D., Creso Moyano, J., Myers, C., & Gluck, M. A. (2001, November 29). Interactive memory systems in the human brain. Nature, 414, 546–550. http://dx.doi.org/10.1038/35107080
Poldrack, R. A., & Gabrieli, J. D. (2001). Characterizing the neural mechanisms of skill learning and repetition priming: Evidence from mirror reading. Brain, 124, 67–82. http://dx.doi.org/10.1093/brain/124.1.67
Poldrack, R. A., & Packard, M. G. (2003). Competition among multiple memory systems: Converging evidence from animal and human brain studies. Neuropsychologia, 41, 245–251. http://dx.doi.org/10.1016/S0028-3932(02)00157-4
Poldrack, R. A., Prabhakaran, V., Seger, C. A., & Gabrieli, J. D. E. (1999). Striatal activation during acquisition of a cognitive skill. Neuropsychology, 13, 564–574. http://dx.doi.org/10.1037/0894-4105.13.4.564
Reber, P. J., Gitelman, D. R., Parrish, T. B., & Mesulam, M. M. (2003). Dissociating explicit and implicit category knowledge with fMRI. Journal of Cognitive Neuroscience, 15, 574–583. http://dx.doi.org/10.1162/089892903321662958
Schacter, D. L., Wagner, A. D., & Buckner, R. L. (2000). Memory systems of 1999. In E. Tulving & F. I. M. Craik (Eds.), Oxford handbook of memory (pp. 627–643). New York: Oxford University Press.
Schroeder, J. P., Wingard, J. C., & Packard, M. G. (2002). Post-training reversible inactivation of hippocampus reveals interference between memory systems. Hippocampus, 12, 280–284. http://dx.doi.org/10.1002/hipo.10024
Schwarz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6, 461–464. http://dx.doi.org/10.1214/aos/1176344136
Spiering, B. J., & Ashby, F. G. (2008). Response processes in information-integration category learning. Neurobiology of Learning and Memory, 90, 330–338. http://dx.doi.org/10.1016/j.nlm.2008.04.015
Squire, L. R. (2004). Memory systems of the brain: A brief history and current perspective. Neurobiology of Learning and Memory, 82, 171–177. http://dx.doi.org/10.1016/j.nlm.2004.06.005
Takane, Y., & Shibayama, T. (1992). Structures in stimulus identification data. In F. G. Ashby (Ed.), Probabilistic multidimensional models of perception and cognition (pp. 335–362). Hillsdale, NJ: Erlbaum.
Taylor, J. A., & Ivry, R. B. (2012). The role of strategies in motor learning. In A. Kingstone & M. B. Miller (Eds.), The year in cognitive neuroscience: Annals of the New York Academy of Sciences (Vol. 1251, pp. 1–12). New York: New York Academy of Sciences. http://dx.doi.org/10.1111/j.1749-6632.2011.06430.x
Treutwein, B., Rentschler, I., & Caelli, T. (1989). Perceptual spatial frequency–orientation surface: Psychophysics and line element theory. Biological Cybernetics, 60, 285–295. http://dx.doi.org/10.1007/BF00204126
Tulving, E., & Markowitsch, H. J. (1998). Episodic and declarative memory: Role of the hippocampus. Hippocampus, 8, 198–204. http://dx.doi.org/10.1002/(SICI)1098-1063(1998)8:3<198::AID-HIPO2>3.0.CO;2-G
Waldron, E. M., & Ashby, F. G. (2001). The effects of concurrent task interference on category learning: Evidence for multiple category learning systems. Psychonomic Bulletin & Review, 8, 168–176. http://dx.doi.org/10.3758/BF03196154
Willingham, D. B. (1998). A neuropsychological theory of motor skill learning. Psychological Review, 105, 558–584. http://dx.doi.org/10.1037/0033-295X.105.3.558
Willingham, D. B., Wells, L. A., Farrell, J. M., & Stemwedel, M. E. (2000). Implicit motor sequence learning is represented in response locations. Memory & Cognition, 28, 366–375. http://dx.doi.org/10.3758/BF03198552
Worthy, D. A., Markman, A. B., & Maddox, W. T. (2013). Feedback and stimulus-offset timing effects in perceptual category learning. Brain and Cognition, 81, 283–293. http://dx.doi.org/10.1016/j.bandc.2012.11.006
Yin, H. H., & Knowlton, B. J. (2004). Contributions of striatal subregions to place and response learning. Learning & Memory, 11, 459–463. http://dx.doi.org/10.1101/lm.81004
Yin, H. H., & Knowlton, B. J. (2006). The role of the basal ganglia in habit formation. Nature Reviews Neuroscience, 7, 464–476. http://dx.doi.org/10.1038/nrn1919
Yin, H. H., Knowlton, B. J., & Balleine, B. W. (2004). Lesions of dorsolateral striatum preserve outcome expectancy but disrupt habit formation in instrumental learning. European Journal of Neuroscience, 19, 181–189. http://dx.doi.org/10.1111/j.1460-9568.2004.03095.x
Yin, H. H., Knowlton, B. J., & Balleine, B. W. (2005). Blockade of NMDA receptors in the dorsomedial striatum prevents action-outcome learning in instrumental conditioning. European Journal of Neuroscience, 22, 505–512. http://dx.doi.org/10.1111/j.1460-9568.2005.04219.x
Yin, H. H., Ostlund, S. B., & Balleine, B. W. (2008). Reward-guided learning beyond dopamine in the nucleus accumbens: The integrative functions of cortico-basal ganglia networks. European Journal of Neuroscience, 28, 1437–1448. http://dx.doi.org/10.1111/j.1460-9568.2008.06422.x
Zeithamova, D., & Maddox, W. T. (2006). Dual-task interference in perceptual category learning. Memory & Cognition, 34, 387–398. http://dx.doi.org/10.3758/BF03193416

Appendix A

Stimulus Transformation

This appendix describes the method we used to generate the spatial frequency, orientation (f, o) pairs that define our stimuli. Spatial frequency values, f, carry units of cycles per degree of visual angle, and orientation values carry units of radians. First, (x, y) pairs were generated from a bivariate uniform distribution on the interval (0, 100) for each dimension. Next, (x, y) pairs were linearly transformed into $(x', y')$ pairs to span the interval $(-1, 2)$ on dimension $x'$ and $\left(\frac{\pi}{11}, \frac{3\pi}{8} + \frac{\pi}{11}\right)$ on dimension $y'$ via

$$x' = \frac{3x}{100} - 1 \tag{A.1}$$

and

$$y' = \frac{3\pi}{8}\,\frac{y}{100} + \frac{\pi}{11}. \tag{A.2}$$

Next, $x'$ values were mapped to spatial frequency, f, values via

$$f = 2^{x'}. \tag{A.3}$$

We used a multistep procedure to convert $y'$ values into o values. First, we collected and sorted in ascending order all $y'$ values into a vector $y_s$. From $y_s$, we defined new vectors

$$z = 4.7 \sin 2y_s \tag{A.4}$$

and

$$y''(n) = y''(n-1) + \sqrt{\left(y_s(n) - y_s(n-1)\right)^2 + \left(z(n) - z(n-1)\right)^2}, \tag{A.5}$$

where the n terms reference the nth element of the corresponding vector, and

$$y''(1) = \sqrt{y_s^2(1) + z^2(1)}. \tag{A.6}$$

The vector $y''$ was then converted into a vector of stimuli orientations, o, via

$$o = \frac{y''\left(\max y_s - \min y_s\right)}{\max y'' - \min y''} - \min y'' + \min y_s. \tag{A.7}$$

Finally, the elements of o were returned to their original sort order and recombined into (f, o) pairs.
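The procedure in this appendix can be sketched in a few lines of Python. This is an illustrative reading of Equations A.1 through A.7, not the code used in the experiments. It assumes that Equation A.3 is an exponent (f equals 2 raised to the power x′), that Equation A.4 is 4.7 times the sine of twice the sorted value, and that Equation A.7 is applied exactly as printed. The function name `generate_stimuli` and the `seed` argument are hypothetical.

```python
import math
import random


def generate_stimuli(n, seed=0):
    """Return n (f, o) pairs following one reading of Equations A.1-A.7."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 100) for _ in range(n)]
    ys = [rng.uniform(0, 100) for _ in range(n)]

    # A.1 and A.2: linear rescaling of the uniform samples.
    x_prime = [3 * x / 100 - 1 for x in xs]
    y_prime = [(3 * math.pi / 8) * (y / 100) + math.pi / 11 for y in ys]

    # A.3 (assumed to be an exponent): spatial frequency in cycles/degree.
    f = [2 ** xp for xp in x_prime]

    # Sort y' in ascending order, remembering each value's original index.
    order = sorted(range(n), key=lambda i: y_prime[i])
    y_s = [y_prime[i] for i in order]

    # A.4 (assumed reading): z = 4.7 * sin(2 * y_s).
    z = [4.7 * math.sin(2 * v) for v in y_s]

    # A.6 gives the first element; A.5 accumulates arc length along (y_s, z).
    y_pp = [math.hypot(y_s[0], z[0])]
    for k in range(1, n):
        step = math.hypot(y_s[k] - y_s[k - 1], z[k] - z[k - 1])
        y_pp.append(y_pp[-1] + step)

    # A.7, applied term by term as printed.
    scale = (max(y_s) - min(y_s)) / (max(y_pp) - min(y_pp))
    o_sorted = [v * scale - min(y_pp) + min(y_s) for v in y_pp]

    # Return orientations to the original (unsorted) order.
    o = [0.0] * n
    for rank, i in enumerate(order):
        o[i] = o_sorted[rank]
    return list(zip(f, o))


pairs = generate_stimuli(200, seed=1)
```

Because the arc-length vector is nondecreasing, the rank order of the orientations matches the rank order of the y′ values, and the spread of o equals the spread of the sorted y′ values.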


Appendix B

Decision-Bound Models Used in Experiments 1 and 2

The models described in this appendix have been used in many previous studies. For more details, see Ashby (1992) or Maddox and Ashby (1993).

Rule-Based Models

One-Dimensional Classifier

This model assumes that participants set a decision criterion on a single stimulus dimension. For example, a participant might base his or her categorization decision on the following rule: “Respond A if the bar width is small; otherwise, respond B.” Two versions of the model were fit to the data. One version assumed a decision based on bar width, and the other version assumed a decision based on orientation. These models had two parameters: a decision criterion along the relevant perceptual dimension and a perceptual noise variance.
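A minimal sketch of how a one-dimensional rule of this kind can yield response probabilities. It assumes, for illustration only, that the percept equals the stimulus value plus zero-mean Gaussian noise, so that the probability of responding A is the normal cumulative probability of the scaled distance from the stimulus to the criterion. The function names and the convention that values below the criterion map to Category A are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_a_one_dim(value, criterion, noise_sd):
    """P(respond A) when A is assigned to values below the criterion.

    noise_sd is the square root of the perceptual noise variance.
    Gaussian noise is an illustrative assumption.
    """
    return normal_cdf((criterion - value) / noise_sd)


# A stimulus exactly on the criterion is classified at chance.
p_on_criterion = prob_a_one_dim(1.0, criterion=1.0, noise_sd=0.5)
```

The same function covers both fitted versions: passing bar width as `value` gives the bar-width version, and passing orientation gives the orientation version.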

General Conjunctive Classifier (GCC)

Four versions of the GCC (Ashby, 1992) were fit to the data. Each version assumed that participants used a conjunction rule, and the versions differed only in which conjunction defined Category A: (a) “Respond A if the bar width is small and the orientation is shallow (e.g., less than 45°); respond B otherwise”; (b) “Respond A if the bar width is small and the orientation is steep (e.g., greater than 45°); respond B otherwise”; (c) “Respond A if the bar width is large and the orientation is shallow (e.g., less than 45°); respond B otherwise”; and (d) “Respond A if the bar width is large and the orientation is steep (e.g., greater than 45°); respond B otherwise.” All versions had three parameters: a decision criterion on each of the two stimulus dimensions (one for orientation and one for bar width) and a perceptual noise variance.
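The four conjunction rules can be sketched with a single function and two flags. This sketch assumes, as an illustration not stated in this appendix, independent zero-mean Gaussian noise with a common standard deviation on each dimension, so that the probability of responding A is the product of the two one-dimensional probabilities. All names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_a_conjunctive(width, orient, c_width, c_orient, noise_sd,
                       a_if_small=True, a_if_shallow=True):
    """P(respond A) under a conjunction rule on bar width and orientation.

    The two flags select which side of each criterion belongs to the
    Category A conjunction, giving the four versions of the rule.
    """
    p_width = normal_cdf((c_width - width) / noise_sd)      # P(percept below c_width)
    p_orient = normal_cdf((c_orient - orient) / noise_sd)   # P(percept below c_orient)
    if not a_if_small:
        p_width = 1.0 - p_width
    if not a_if_shallow:
        p_orient = 1.0 - p_orient
    # Independence assumption: both conditions must hold to respond A.
    return p_width * p_orient


p_corner = prob_a_conjunctive(1.0, 0.5, c_width=1.0, c_orient=0.5, noise_sd=0.1)
```

A stimulus lying on both criteria satisfies each condition with probability .5, so the conjunction holds with probability .25 under every version.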

Information-Integration (II) Model: General Linear Classifier (GLC)

The GLC assumes that participants divide the stimulus space using a linear decision bound. Categorization decisions are then based on which region each stimulus is perceived to fall in. These decision bounds require linear integration of both stimulus dimensions, thereby producing an II decision strategy. The GLC has three parameters: the slope and intercept of the linear decision bound and a perceptual noise variance.
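A minimal sketch of a linear decision bound with perceptual noise. It assumes, for illustration, that the bound is written as orientation = slope × bar width + intercept, that Category A is the region below the bound, and that zero-mean Gaussian noise acts on the perpendicular distance from the stimulus to the bound. These conventions and all names are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_a_linear(width, orient, slope, intercept, noise_sd):
    """P(respond A) for the bound orient = slope * width + intercept."""
    # Signed perpendicular distance from the stimulus to the bound,
    # positive when the stimulus lies below the line (the assumed A region).
    distance = (slope * width + intercept - orient) / math.sqrt(1.0 + slope ** 2)
    return normal_cdf(distance / noise_sd)


p_on_bound = prob_a_linear(1.0, 1.0, slope=1.0, intercept=0.0, noise_sd=0.2)
```

Because the distance depends on both dimensions whenever the slope is nonzero, neither dimension alone determines the response, which is what makes the strategy information-integration.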

Random-Guessing Models

Guessing Model

This model assumes that the participant guesses randomly and that all responses are equally likely. Thus, the predicted probability of responding A or B is .50. This model has no free parameters.

General Random Responder Model

This model assumes that the participant guesses randomly but that one response is more likely than the other. Thus, the predicted probabilities of responding A and B are parameters that are constrained to sum to 1, so this model has one free parameter.
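The two random-guessing models have simple closed-form log-likelihoods, sketched below for a block summarized by its response counts. For the biased model, the maximum likelihood estimate of the probability of responding A is the observed proportion of A responses. The function names are hypothetical.

```python
import math


def loglik_guessing(n_a, n_b):
    """Log-likelihood when P(A) = P(B) = .5 (no free parameters)."""
    return (n_a + n_b) * math.log(0.5)


def fit_biased_guessing(n_a, n_b):
    """Return the ML estimate of P(A) and the maximized log-likelihood."""
    p_a = n_a / (n_a + n_b)
    loglik = 0.0
    # Skip empty cells so that log(0) is never evaluated.
    if n_a > 0:
        loglik += n_a * math.log(p_a)
    if n_b > 0:
        loglik += n_b * math.log(1.0 - p_a)
    return p_a, loglik


p_hat, ll_biased = fit_biased_guessing(70, 30)
ll_fixed = loglik_guessing(70, 30)
```

The biased model can never fit worse than the fixed model, because setting its parameter to .5 reproduces the fixed model exactly.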

Model Fitting and Selection of Best Fitting Models

Each of these models was fit separately to each block of data from each participant. The model parameters were estimated using maximum likelihood (Ashby, 1992), and the goodness-of-fit statistic was $\mathrm{BIC} = -2\ln(L) + k\ln(n)$, where k is the number of free parameters, n is the number of observations being fit, and L is the likelihood of the model given the data (Akaike, 1974; Schwarz, 1978; Takane & Shibayama, 1992). The BIC (Bayesian information criterion) statistic penalizes models with extra free parameters to compensate for the possibility of overfitting. The best fitting model is the one with the lowest BIC score.

Received August 7, 2013
Revision received October 14, 2014
Accepted December 5, 2014
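The model-selection step in Appendix B can be illustrated with a short sketch that computes BIC from a maximized log-likelihood and picks the model with the lowest score. The log-likelihood values below are made up for a hypothetical 100-trial block and are not results from the experiments. The function and variable names are also hypothetical.

```python
import math


def bic(log_likelihood, k, n):
    """BIC = -2 ln(L) + k ln(n), with ln(L) the maximized log-likelihood."""
    return -2.0 * log_likelihood + k * math.log(n)


def best_model(fits, n):
    """Return the name of the lowest-BIC model and all BIC scores.

    fits maps a model name to (maximized log-likelihood, free parameters).
    """
    scores = {name: bic(ll, k, n) for name, (ll, k) in fits.items()}
    return min(scores, key=scores.get), scores


# Hypothetical fits for one block of 100 trials (illustrative numbers only).
example_fits = {
    "guessing": (100 * math.log(0.5), 0),
    "one_dimensional": (-45.0, 2),
    "GLC": (-40.0, 3),
}
winner, scores = best_model(example_fits, n=100)
```

In this made-up block the GLC has one more parameter than the one-dimensional classifier, but its better fit more than offsets the larger penalty, so it has the lowest BIC.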
