Journal of Experimental Psychology: Human Perception and Performance 2014, Vol. 40, No. 3, 1072-1091

© 2014 American Psychological Association 0096-1523/14/$12.00 DOI: 10.1037/a0035648

Deconstructing Mental Rotation

Axel Larsen
University of Copenhagen

A random walk model of the classical mental rotation task is explored in two experiments. By assuming that a mental rotation is repeated until sufficient evidence for a match/mismatch is obtained, the model accounts for the approximately linearly increasing reaction times (RTs) on positive trials, flat RTs on negative trials, false alarms and miss rates, effects of complexity, and for the number of eye movement switches between stimuli as functions of angular difference in orientation. Analysis of eye movements supports key aspects of the model and shows that initial processing time is roughly constant until the first saccade switch between stimulus objects, while the duration of the remaining trial increases approximately linearly as a function of angular discrepancy. The increment results from additive effects of (a) a linear increase in the number of saccade switches between stimulus objects, (b) a linear increase in the number of saccades on a stimulus, and (c) a linear increase in the number and in the duration of fixations on a stimulus object. The fixation duration increment was the same on simple and complex trials (about 15 ms per 60°), which suggests that the critical orientation alignment takes place during fixations at very high speed.

Keywords: mental rotation, eye movements, visual working memory, random walk

Our ability to determine that objects have the same shape despite differences in orientation or size is a classical problem in visual perception. It was thoroughly discussed at the turn of the 19th century by Mach (1902) and since has been treated by numerous authors (e.g., Biederman, 1987; Dodwell, 1970; Edelman, 1995; Furmanski & Engel, 2000; Gibson, 1969; Graf, 2006; Hebb, 1949; Hodgetts, Hahn, & Chater, 2009; Köhler, 1929; Lashley, 1942; Pitts & McCulloch, 1947; Rock, 1956; Tarr & Gauthier, 1998). In their landmark study on mental rotation some 40 years ago R. N. Shepard and Metzler (1971) opened a new line of attack that was followed by related studies on visual transformations of size (Bundesen & Larsen, 1975; Sekuler & Nash, 1972), mental scanning of images maintained in visual short term memory (Kosslyn, 1973; Kosslyn, 1980), and mental translation of visual images (Larsen & Bundesen, 1998). R. N. Shepard and Metzler displayed projections of two unfamiliar three-dimensional figures on a computer screen and recorded the time subjects needed to decide whether the objects had the same shape as a function of their angular difference in orientation. They found that reaction time (RT) costs increased linearly as a function of angular discrepancy

at about 1 s per 60 degrees, and that this rate was roughly the same for rotations in the picture plane and rotations in depth. R. N. Shepard and Metzler (1971; see also Metzler & Shepard, 1974) offered an interpretation of their findings that was straightforward and intuitively compelling, but also radically different from previous attempts to understand orientation invariance (e.g., Selfridge, 1959; Sutherland, 1968). The interpretation was essentially based on introspection: All subjects claimed that they imagined one of the figures rotated into the same orientation as the other one and that they could carry out this mental rotation at no greater than a certain limiting rate. The notion that mental rotation should be conceived in close analogy to actually perceiving a rotating physical object was developed and extensively tested in further studies (R. N. Shepard & Cooper, 1982). Of particular note is R. N. Shepard and Judd's study (1976) on stroboscopic motion in which they displayed the very same three-dimensional objects in sequential alternation (zero ISI). With a suitable stimulus onset asynchrony (SOA), participants reported vivid impressions of a rigid three-dimensional object rotating back and forth in the picture plane or in depth. The critical SOA at which the impression of rigid rotational motion broke down increased approximately linearly as a function of angular difference in orientation with nearly the same slopes for rotations in the picture plane or in depth. The linear increase suggests that there is an upper limit to the velocity of the rotational movement. Farrell, Larsen, and Bundesen (1982) showed that the limit relates to angular velocity, and not the linear velocity of the fastest moving subpattern.
Further evidence on the close relationship between transformation of visual images in mental rotation and visual motion perception comes from studies of the motion aftereffect (MAE), which show that the MAE interferes with mental rotation (see, e.g., Corballis & McLaren, 1982; Heil, Bajric, Rösler, & Henninghausen, 1997; Jolicoeur, Corballis, & Lawson, 1998; Seurinck, de Lange, Achten, & Vingerhoets, 2011). In line with the investigations in visual psychophysics, a meta-analysis of 32 investigations of brain activations during mental rotation (Zacks, 2008) reveals that all experiments using transformation specific contrasts (i.e., within-task comparisons of effects of mental rotation, e.g., comparing large rotations with small rotations) have found activations located about (-47.5, -59.5, -10.0 in Talairach space) that corresponds to the visual motion area (V5/MT+).

This article was published Online First February 10, 2014. Axel Larsen, Center for Visual Cognition, Department of Psychology, University of Copenhagen, Denmark. This research was financially supported by the Nordic Council (NOS-S) to the Nordic Center of Excellence in Cognitive Control. I thank Claus Bundesen for critical comments and helpful suggestions, and Martin Lange for programming synchronization protocols between Eyelink II and the PC graphic display system. Correspondence concerning this article should be addressed to Axel Larsen, Center for Visual Cognition, Department of Psychology, University of Copenhagen, Øster Farimagsgade 2A, DK-1353 Copenhagen K, Denmark. E-mail: [email protected]

R. N. Shepard and Metzler's (1971) interpretation of mental rotation as a unitary or holistic visual process has been disputed in numerous studies ever since (e.g., Anderson, 1978; Folk & Luce, 1987; Liesefeld & Zimmer, 2013; Pylyshyn, 1973, 2003). Besides the philosophical and theoretical aspects of the dispute, the experimental support for Shepard and Metzler's original claim also has been questioned. In particular it has turned out that estimates of the rate of mental rotation vary tremendously as a function of stimulus complexity, stimulus familiarity, training, and similarity within negative stimulus pairs (e.g., Bethell-Fox & Shepard, 1988; Cohen & Kubovy, 1993; Dahlstrom-Hakki, Pollatsek, Fisher, Miller, & Rayner, 2008; Folk & Luce, 1987; Förster, Gebhardt, Lindlar, Siemann, & Delius, 1996; Pylyshyn, 1979; Yuille & Steiger, 1982; but see also Cooper & Podgorny, 1976). In addition, the systematic deviation from linear RT functions in some studies (Bundesen, Larsen, & Farrell, 1981; Cooper & Shepard, 1973) is puzzling and difficult to reconcile with R. N. Shepard and Metzler's original interpretation (but see Searle & Hamm, 2012).

This paper has two main goals. The first goal is to demonstrate a principle that, in part, may explain the highly inconsistent estimates of mental rotation velocity. This is done by an explicit computational model of performance in a typical mental rotation task.
The model assumes that limited capacity in visual short-term memory (VSTM) severely constrains the amount of information that is encoded from one of the stimuli as a (more or less degraded) visual image, and hence that when the transformed visual image is matched against the other stimulus the evidence does not suffice to meet the requirement to respond as quickly as possible, while keeping errors low. Therefore this process of encoding, transformation, and match is repeated until the evidence sampled suffices to meet instructions. Unlike structural models (e.g., Biederman, 1987), the random walk model is based on well-established properties of visual short-term memory. Visual short-term memory (STM) capacity is modest. Sperling (1960) showed that the number of letters we can read off from a visual image of a briefly exposed stimulus display is about four or five. Later studies (e.g., Bundesen, 1990; Bundesen, Pedersen, & Larsen, 1984; Luck & Vogel, 1997; Pashler, 1988; Shibuya & Bundesen, 1988) generally reported a somewhat lower VSTM capacity of three or four alphanumeric items. The capacity to retain colored visual shapes (Todd & Marois, 2004; Vogel & Machizawa, 2004) is also about three or four, but when the composition of stimuli in terms of component features gets more complex, capacity, that is, number of objects retained, seems to decrease (e.g., Alvarez & Cavanagh, 2004; Sørensen & Kyllingsbæk, 2012; Wheeler & Treisman, 2002). Is the VSTM store hypothesized in these studies identical to the store in which a few subpatterns (or features) of a stimulus are encoded and subsequently transformed with respect to orientation? Or, to put it more directly, is mental rotation of visual images done in VSTM? To answer the question it is instructive to note that the successive stimulus presentation in the popular change detection

paradigm (Luck & Vogel, 1997; Pashler, 1988; Phillips, 1974), which is often used to estimate VSTM capacity, is not essentially different from the stimulus presentation in the successive matching paradigm (see later) that is used to invoke mental rotation when the stimuli differ with respect to orientation. Basically the two paradigms only differ with respect to the task participants are requested to solve. In change detection, the task is to report if the second stimulus is changed relative to the first one. Instructions in the successive matching task are also to determine whether the second stimulus has changed relative to the first, except for irrelevant changes in orientation. It may be that different aspects of the first stimulus are encoded in the two paradigms, but it is natural to assume that stimulus encoding in each case is done into VSTM. Two recent studies (Hyun & Luck, 2007; Prime & Jolicoeur, 2010) also suggested an affirmative answer. Hyun and Luck used a dual task paradigm and found interference between storage of colors supposedly maintained in VSTM and mental rotation, but no interference on mental rotation with non-VSTM short-term storage of positions in space. Prime and Jolicoeur (2010) monitored evoked potentials released by mental rotation and showed that the duration of the electrophysiological signature that is coupled to maintenance of information in VSTM correlated with angular difference, and thus by hypothesis to the duration of mental rotation.

The second goal was to map the contributions and possible organization of the various processing components that generate the systematic linear increase in response times as a function of angular difference in orientation between stimuli. This is done in Experiment 2 by monitoring saccades and fixations during task performance, and by systematically relating their number and duration to the difference in the orientation.

Experiment 1

In this experiment the principal goal was to explore if visual performance in the classical R. N. Shepard and Metzler (1971) paradigm may emerge as a result of repeatedly executing a mental rotation of a more or less schematic visual image of one of the stimulus objects followed by a comparison of the transformed visual image with the other stimulus. To achieve this goal mental rotation was investigated with two different paradigms: Simultaneous matching in which two stimuli in different orientation are presented side by side at the same time, and successive matching in which stimuli in different orientation are shown one at a time in succession. To solve the successive matching task the first stimulus must be encoded and maintained as a visual image in VSTM until the onset of the second stimulus, and then brought into alignment with the second stimulus. A mental rotation can only be made once in this case. In contrast, multiple mental rotations may be done in the simultaneous matching task. In particular, it should be possible to determine whether performance in the simultaneous matching task can be modeled as repetitions of the mental rotation that is done but once in the successive matching task. For convenience, the mental rotation that is done but once in successive matching and the mental rotation that by hypothesis is done repeatedly in simultaneous matching are both labeled as simple mental rotations. In both cases, mental rotation is presumably done while the observer by one or more fixations and saccades inspects one stimulus object at a time.


The notion that mental rotation inferred from linearly increasing RTs as a function of angular discrepancy may be based on repeated application of a simple mental rotation is not new, but it has never been pursued in detail, let alone rigorously specified in a computational model. It derives from observations made by several researchers (e.g., Carpenter & Just, 1978; Metzler & Shepard, 1974) that when (fairly complex) stimuli are displayed side by side at the same time, participants may often go back and forth between stimuli several times before reaching a decision. Presumably they do this because their VSTM capacity only allows them to encode, mentally rotate, and compare a schematic image or a few fragments of a stimulus at a time. A natural formalization of these observations is to model visual performance in the classical mental rotation task with two simultaneously displayed tri-dimensional objects as a random walk with a constant step size between two boundaries (thresholds) that represent the evidence needed for reaching a positive (match) or negative (mismatch) decision. The walk continues one step at a time until one of the boundaries is reached. A step toward one of the boundaries cancels a step toward the other boundary. Boundary values and the probability of stepping toward the positive or negative threshold are treated as free parameters. For each random step then, a global processing module is executed. For positive stimulus pairs the module comprises encoding a schematic visual image of one of the objects, a simple mental rotation of the visual image of the object, and a subsequent test for a match against the other object. Each time the module is executed, one unit of perceptual evidence is accumulated. On trials with congruent stimulus pairs the probability of collecting a unit of positive evidence is represented by the parameter $p^{+}$. The probability of collecting a unit of negative evidence is $1 - p^{+}$.
The processing module is repeated until the accumulated evidence exceeds a positive ($\lambda^{+}$) or a negative threshold ($\lambda^{-}$). For the incongruent (negative) stimulus pairs, the same processing module of stimulus encoding, mental rotation, and comparison is executed. The probability of drifting one unit toward the negative threshold is then set to $p^{-}$, and the probability for drifting toward the positive threshold to $1 - p^{-}$. When the drift parameters $p^{+}$ and $p^{-}$ and thresholds $\lambda^{+}$ and $\lambda^{-}$ are known, the general theory of random walks provides explicit formulas for calculating false alarm and miss rates and mean number of steps to reach upper and lower thresholds (see, e.g., Bundesen, 1982; Feller, 1970). Mean number of steps multiplied by the time to execute the encode-rotate-compare processing module added to a base RT was used to predict positive and negative RTs. Studies of mental rotation generally show that error rates increase as a function of angular difference, which suggests that response criteria change as a function of angular difference. A convenient and straightforward way to accommodate this in a random walk model framework is to let positive and negative thresholds vary as a linear function of angular difference in orientation.
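As an illustration of the kind of closed-form result the random walk theory provides, the following minimal sketch evaluates the classical gambler's-ruin expressions for a unit-step walk that starts at zero between two absorbing thresholds. It is a sketch under stated assumptions, not the fitting procedure used in this article: the function name and arguments are hypothetical, the threshold distances are assumed to be positive integers, and the mean number of steps returned is the unconditional mean (to either threshold), not the mean conditional on which threshold is reached.

```python
def absorption_summary(p_up, upper, lower):
    """Unit-step random walk starting at 0 (illustrative helper, hypothetical name).

    p_up  : probability of a step toward the positive threshold (0 < p_up < 1)
    upper : integer distance from 0 to the positive threshold (> 0)
    lower : integer distance from 0 to the negative threshold (> 0)

    Returns (probability of ending at the negative threshold,
             unconditional mean number of steps until either threshold).
    """
    q = 1.0 - p_up
    a = upper + lower   # distance between the two thresholds
    z = lower           # distance from the negative threshold to the start
    if abs(p_up - q) < 1e-12:
        # Symmetric walk: the general expressions reduce to these limits.
        prob_lower = 1.0 - z / a
        mean_steps = float(z * (a - z))
    else:
        r = q / p_up
        prob_lower = (r ** a - r ** z) / (r ** a - 1.0)
        mean_steps = z / (q - p_up) - (a / (q - p_up)) * (1.0 - r ** z) / (1.0 - r ** a)
    return prob_lower, mean_steps


if __name__ == "__main__":
    # Hypothetical positive trial: drift .8 toward a match, thresholds at +2 and -2.
    miss_rate, steps = absorption_summary(0.8, 2, 2)
    print(f"miss rate = {miss_rate:.4f}, mean steps = {steps:.3f}")
```

On a positive trial the first returned value plays the role of a miss rate. On a negative trial, where the drift parameter is defined toward the negative threshold, one would call the function with `p_up = 1 - p_minus` and read the false alarm rate as one minus the returned probability.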

Method

Participants. Three paid undergraduates and one graduate student between 24 and 29 years participated in the study. The participants were males, naive with respect to the purpose of this experiment, but otherwise fairly well-trained experimental participants. All had normal or corrected-to-normal vision.

Stimuli and apparatus. The stimulus material was a subset of the original stimuli¹ used by R. N. Shepard and Metzler (1971). The set comprised five three-dimensional prototypes portrayed from seven different views, from which any angular difference between 0° and 180° in steps of 20° could be constructed. The stimuli were displayed on a computer monitor. Each view was inscribed in a circle with a diameter of 14 cm on the screen and viewing distance was 60 cm. There were two conditions: a simultaneous condition in which two views of the same prototype were shown at the center of the screen side by side at the same time until the participant responded, and a successive condition in which the two views were displayed one at a time at the center of the screen. In the simultaneous condition the shortest distance between the perimeters of the inscribing circles was 3 cm. In the successive condition the first view was displayed for 1,500 ms, and the second view remained visible until the participant responded. The interstimulus interval between the first and the second view was 1,500 ms. Intertrial intervals were 1,250 ms in the simultaneous condition and 1,750 ms in the successive condition. In the simultaneous matching condition the angular difference between the views of the prototypes was 0°, 20°, 40°, 60°, 80°, 100°, 120°, 140°, 160°, and 180° about the vertical axis. For each prototype and for each angular difference in orientation there was one positive and one negative trial. Negative trials were identical to positive trials except for a replacement of one pattern with its mirror image with respect to the frontal plane. A block of simultaneous matching trials thus comprised 50 positive and 50 negative trials. The composition of stimulus pairs in the successive matching condition was the same as in the simultaneous matching condition.
Because error rates approached 40 to 50% at large differences in orientation in pilot experiments, only a restricted range (0°, 20°, 40°, 60°, 80°) of five angular differences in orientation was investigated. A block of successive matching trials thus comprised 25 positive and 25 negative trials. The direction of the shortest rotation path was always the same: In the successive matching condition the leftmost parts of the stimuli were always rotated about the vertical axis away from the viewer into the screen; in the simultaneous condition the shortest path was to turn the leftmost parts of the left stimulus about the vertical axis into the screen. There were eight blocks of successive matching trials interleaved with eight blocks of simultaneous matching trials in an ABAB . . . sequence starting with the simultaneous matching condition (two participants) or with the successive matching condition (two participants). Within each block the sequence of trials was randomized anew for each participant.

Procedure. The participants were tested individually and were asked to determine if the two patterns in a pair were congruent, disregarding differences in angular departure, if any, and to respond quickly with only a few errors. Accuracy was fed back after each trial in the lower left corner of the screen. Participants practiced for about a quarter of an hour and were informed of the possible changes of the direction and magnitude (up to 180°) of the orientation of the stimuli. The experiment was self-paced between blocks and took about 3 to 4 hr to complete, including breaks.

¹ Roger Shepard and Michael Tarr have made copies of the original tri-dimensional stimuli publicly available. The stimulus material was retrieved from http://www.cog.brown.edu/-tarr/stimuli.htmltfsh

Results

Subjective reports. The participants were debriefed after completing the experiment. They reported that they focused on one or more characteristic subpatterns on the first stimulus object in the successive matching condition and at the same time tried to keep the general spatial layout of the stimulus in mind. At the onset of the second stimulus, they looked for the memorized characteristic subpatterns and mentally rotated the image to fit the corresponding subpatterns in the stimulus. When the stimuli were displayed side by side at the same time, the participants used a similar strategy of encoding one of the stimuli with particular attention to prominent details or subpatterns. Corresponding details in the other stimulus were then identified and confirmed (or falsified) after mentally rotating the image to fit the other stimulus. In many cases this process was repeated, sometimes after encoding new characteristic features. In some cases the participants claimed that they immediately realized that stimulus patterns were mirror image pairs.

Response latencies. All RTs for correct responses were analyzed. Figures 1 and 2 illustrate group mean RTs and response accuracies for congruent and incongruent pairs, respectively. The angular difference within negative pairs in Figure 2 corresponds to the angle between the congruent stimuli prior to the replacement of one of the stimulus patterns by its reflection in the frontal plane. To determine effects of mode of presentation (simultaneous vs. successive matching) angular difference was only analyzed at the five levels (0°, 20°, 40°, 60°, 80°) common to both modes. The overall effects of mode of presentation, F(1, 3) = 22.88, p = .02, $\eta_p^2$ = .88, type of stimulus pair (positive vs. negative), F(1, 3) = 15.45, p = .03, $\eta_p^2$ = .84, and angular difference in orientation, F(4, 12) = 10.65, p = .02, $\eta_p^2$ = .78, were significant.² The interaction between angular difference in orientation and type (positive or negative) of stimulus pair, F(4, 12) = 6.97, p = .03, $\eta_p^2$ = .70, was also significant. Regardless of mode of presentation, the effect of angular difference was only significant on positive trials, F(4, 12) = 25.79, p = .01, $\eta_p^2$ = .90 and F(4, 12) = 8.80, p = .03, $\eta_p^2$ = .75, in the simultaneous and successive matching task, respectively. There was a reliable linear component in both tasks, F(1, 3) = 17.22, p = .025, $\eta_p^2$ = .85, successive matching, and F(1, 3) = 45.86, p = .01, $\eta_p^2$ = .94, simultaneous matching, respectively. The interaction between mode of presentation (see Figure 1) and angular difference was reliable, F(4, 12) = 13.53, p = .02, $\eta_p^2$ = .82.

Random walk model. Modeling the visual behavior of each participant in the simultaneous matching condition is based on the assumption that a subset of the cognitive procedures that can only be executed once in the successive matching condition is repeatedly executed (serially) in the simultaneous matching condition until the accumulated evidence for a match or mismatch exceeds a fixed threshold. The reciprocal to the rate of mental rotation, $\alpha$, is treated as a free parameter in the random walk model. For comparison purposes this rate, and the inverse to the rate of mental rotation estimated by the linear slope constants in least chi-square fits to successive and simultaneous matching RTs ($\alpha_{\mathrm{Succ}}$ and $\alpha_{\mathrm{Sim}}$, respectively) are displayed in Table 1. Let the duration of the encoding process in the simultaneous matching condition be $T_{\mathrm{Encode}}$, let the time taken to mentally rotate a visual image through the angle $v$ be $\alpha v$, where $\alpha$ is a constant,³ and let
the duration of the comparison process be $T_{\mathrm{Compare}}$. Then the time taken to execute these processes once equals $T_{\mathrm{Encode}} + \alpha v + T_{\mathrm{Compare}}$, and the total RT when repeating them $n$ times is given by

$$RT = RT^{+}_{\mathrm{Baseline}} + n\,(T_{\mathrm{EncodeCompare}} + \alpha v), \quad \text{for positive responses}, \tag{1a}$$

$$RT = RT^{-}_{\mathrm{Baseline}} + n\,T_{\mathrm{EncodeCompare}}, \quad \text{for negative responses}, \tag{1b}$$

where $RT^{+}_{\mathrm{Baseline}}$ and $RT^{-}_{\mathrm{Baseline}}$ represent base RTs for positive and negative responses, and $T_{\mathrm{Encode}} + T_{\mathrm{Compare}}$ is collapsed into the parameter $T_{\mathrm{EncodeCompare}}$. It seems likely that participants also encode, mentally rotate, and match a schematic visual image on negative trials. Presumably this adds a constant latency regardless of angular difference to every negative response. This constant latency may be represented as a component of the $RT^{-}_{\mathrm{Baseline}}$ parameter. In Experiment 1, I made the simplifying assumption that participants on average on the negative pairs did perform a mental rotation corresponding to an angular difference of 90°. Thus, Equation 1b was replaced by Equation 1c.

$$RT = RT^{-}_{\mathrm{Baseline}} + n\,(T_{\mathrm{EncodeCompare}} + \alpha \cdot 90). \tag{1c}$$
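Equations 1a and 1c can be read as a short computation: a base RT plus the number of repetitions times the duration of one encode-rotate-compare cycle. The sketch below is only an illustration of that arithmetic; the function names are hypothetical and the numbers in the example are made up for demonstration, not taken from the fits.

```python
def predicted_rt_positive(base_rt_ms, n_steps, t_encode_compare_ms,
                          alpha_ms_per_deg, angle_deg):
    """Equation 1a: base RT plus n repetitions of (encode/compare time + alpha * v)."""
    cycle_ms = t_encode_compare_ms + alpha_ms_per_deg * angle_deg
    return base_rt_ms + n_steps * cycle_ms


def predicted_rt_negative(base_rt_ms, n_steps, t_encode_compare_ms,
                          alpha_ms_per_deg, assumed_angle_deg=90.0):
    """Equation 1c: as 1a, but with a fixed assumed rotation of 90 degrees."""
    cycle_ms = t_encode_compare_ms + alpha_ms_per_deg * assumed_angle_deg
    return base_rt_ms + n_steps * cycle_ms


if __name__ == "__main__":
    # Illustrative (made-up) values: 500 ms base RT, 3 repetitions,
    # 350 ms encode/compare time, 1 ms per degree, 60 degrees of disparity.
    print(predicted_rt_positive(500.0, 3, 350.0, 1.0, 60.0))
    print(predicted_rt_negative(500.0, 3, 350.0, 1.0))
```

In the model, `n_steps` would be the mean number of random walk steps to the relevant threshold, which itself depends on the drift probability and the threshold settings.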

The theory of random walks (see Feller, 1970) provides the mathematical foundation for computing the number of repetitions $n$ to reach a specified state of evidence, when the probability of gathering one unit of evidence as a result of the comparison is known. To model visual performance, let the initial evidence for a positive or negative response at the beginning of a trial be zero, and let the probability of moving toward the positive threshold $\lambda^{+}$ and accumulating one unit of evidence in favor of a match be $p$. Then, the probability of moving toward the negative threshold $\lambda^{-}$ and accumulating one unit of evidence for a mismatch equals $1 - p$. Evidence is accumulated such that one unit of evidence favoring a match cancels one unit of evidence favoring a mismatch. As proven by Feller (1970, p. 353; see also Bundesen, 1982; Larsen, McIlhagga, & Bundesen, 1999) the probability ($u_n$) of reaching the negative threshold after just $n$ steps is given by

$$u_n = \frac{1}{a}\, 2^{n}\, p^{(n-z)/2}\, (1-p)^{(n+z)/2} \sum_{\nu=1}^{a-1} \cos^{n-1}\frac{\pi\nu}{a}\, \sin\frac{\pi\nu}{a}\, \sin\frac{\pi z \nu}{a}, \quad \text{for } n > z, \tag{2}$$

where $a$ is the distance between $\lambda^{+}$ and $\lambda^{-}$, and $z$ is the distance from $\lambda^{-}$ to zero. If $n < z$, then $u_n = 0$, and if $n = z$, then $u_n = (1 - p)^{n}$. The probability ($U_n$) of reaching the negative threshold after $n$ steps or less is given by

$$U_n = \sum_{k=1}^{n} u_k. \tag{3}$$

² Repeated-measures analysis of variance with alpha level equal to .05 throughout, and with Greenhouse-Geisser adjustment when applicable ($\eta_p^2$ represents partial eta-squared).
³ The reciprocal of the linear slope constant $\alpha$ is usually interpreted as the velocity of mental rotation. However, as will be clear from the analysis of eye movements, $\alpha$ represents the combined effect of saccades and fixations.
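The first-passage probability in Equation 2 can be checked numerically. The sketch below evaluates the trigonometric expression, in the form given in standard treatments of the random walk with two absorbing barriers, and compares it with a direct step-by-step propagation of the walk. It assumes integer values of $a$ and $z$ and $0 < p < 1$; the function names are hypothetical, and the code is a plausibility check, not the procedure used to fit the data.

```python
import math


def first_passage_lower(n, p, a, z):
    """P(walk first reaches the lower threshold at exactly step n), Equation 2.

    p : probability of a step toward the upper threshold (0 < p < 1)
    a : integer distance between the two thresholds
    z : integer distance from the lower threshold to the starting point
    """
    if n < z:
        return 0.0
    q = 1.0 - p
    total = 0.0
    for v in range(1, a):
        x = math.pi * v / a
        total += math.cos(x) ** (n - 1) * math.sin(x) * math.sin(z * x)
    return (2.0 ** n / a) * p ** ((n - z) / 2.0) * q ** ((n + z) / 2.0) * total


def cumulative_lower(n, p, a, z):
    """P(lower threshold reached in n steps or fewer): the sum of u_k for k = 1..n."""
    return sum(first_passage_lower(k, p, a, z) for k in range(1, n + 1))


def first_passage_by_enumeration(n_max, p, a, z):
    """Same probabilities obtained by propagating the state distribution step by step."""
    probs = [0.0] * (a + 1)
    probs[z] = 1.0
    out = []
    for _ in range(n_max):
        new = [0.0] * (a + 1)
        for s in range(1, a):          # only interior states keep walking
            new[s + 1] += p * probs[s]
            new[s - 1] += (1.0 - p) * probs[s]
        out.append(new[0])             # mass absorbed at the lower threshold now
        new[0] = 0.0
        new[a] = 0.0
        probs = new
    return out


if __name__ == "__main__":
    exact = first_passage_by_enumeration(8, 0.7, 5, 2)
    for n, value in enumerate(exact, start=1):
        print(n, round(first_passage_lower(n, 0.7, 5, 2), 6), round(value, 6))
```

The enumeration is slower but makes no use of the closed form, so agreement between the two columns is a useful sanity check on a transcription of Equation 2.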

except for the hypothetical contribution due to mental rotation. Assuming that participants on average execute a mental rotation of 90° on negative trials with simple stimuli, $T^{-}_{\mathrm{SimEncode}}$ should then equal $T^{+}_{\mathrm{SimEncode}} + 90\,\alpha_{\mathrm{Simple}}$. On negative trials with complex stimuli, $T^{-}_{\mathrm{ComEncode}}$ should likewise equal $T^{+}_{\mathrm{ComEncode}} + 90\,\alpha_{\mathrm{Complex}}$. The correlation between predicted encoding and comparison times on negative trials based on these assumptions and the observed data (see Table 4) is .79. In five cases, predictions were too high, in six cases too low, and in one case almost exact.

[Figure 7 plot not reproduced. Top panel: Second Pass; bottom panel: First Pass; legend: Simple, Complex; x-axis: Angular Difference in Orientation.]

Figure 7. Experiment 2: Group means of first and second pass mean fixation duration on positive trials as a function of angular difference in orientation with stimulus complexity as a parameter. Top panel: Second pass fixations fitted by a least chi-square straight line with the same intercept and slope constant on simple and complex trials, $\chi^2(72) = 92.13$, p = .06. Bottom panel: First pass fixations. Solid and dashed lines represent averaged minimum chi-square zero slope straight lines to the data points, $\chi^2(72) = 90.01$, p = .07. Vertical bars around each symbol show standard errors of group means.

Mental rotation may thus be a component in some proportion of negative responses in Experiment 1 and 2. To be sure, negative responses may rely on other processes, which have yet to be revealed.

Nonlinear Predictions

Figures 4 and 6 show a few dips in the fits. This is because the mean number of steps to a threshold as a function of angular difference is not linearly related to the linear change in threshold settings. For example, a linear increase (or decrease) in positive thresholds (see Tables 2 or 4) as a function of angular discrepancy leads to a positive (or negative) acceleration of the mean number of steps to reach the positive threshold. Thus, by averaging across participants the resulting fits may approach straight lines but will rarely be strictly linear.

Orientation invariance may be achieved without using mental rotation, for example by using verbal descriptions, or by fast detection of common features such as similar vertices, which may result in many errors, but nevertheless be used. I find it interesting that the random walk model can, at least to some extent, account for positive trials in which mental rotation is not used. Tables 2 and 4 show that for some participants (e.g., RG) thresholds are less than 1 and greater than -1, which implies that some of the (presumably fast) responses from these participants cannot be based on mental rotation.

Mental Rotation is Done During Fixations

The duration of saccades between stimuli, and saccades within stimuli in the second pass, is essentially a constant function of angular difference in orientation, but the number of saccades both


Table 4
Experiment 2: Constrained Random Walk Model

                                          Participants
Constants/parameters        ML       RG       LL       TS       JO       PH        M
RT⁺ SimBaseline         360.94   549.23   474.20   579.70   599.78   449.76   502.27
RT⁺ ComBaseline         394.50   743.63   369.86   642.23   642.32   529.06   553.60
RT⁻ SimBaseline         368.49   602.88   543.01   578.12   559.32   423.99   512.62
RT⁻ ComBaseline         385.81   557.02   378.82   599.88   654.08   535.41   518.50
T⁺ SimEncode            326.08   449.51   323.85   329.05   352.68   396.76   362.99
T⁻ SimEncode            348.87   455.40   365.43   449.7    498.7    406.42   420.75
T⁺ ComEncode            386.08   365.77   327.85   440.34   443.06   415.62   396.45
T⁻ ComEncode            459.58   596.80   394.87   534.05   620.08   551.89   526.21
α Simple                  0.00     1.03     0.00     0.64     1.03     0.37     0.51
α Complex                 0.61     0.65     0.16     1.37     1.00     0.49     0.71
p⁺ Simple                  .83      .92      .81      .86      .84      .60      .81
p⁺ Complex                 .84      .84      .89      .78      .93      .67      .82
p⁻ Simple                  .72      .72      .78      .77      .81      .87      .78
p⁻ Complex                 .65      .65      .50      .81      .51      .64      .63
λ⁺0 Simple                1.58     1.63     1.73     1.26     0.95     1.25     1.40
λ⁺0 Complex               1.68     1.48     2.55     1.09     3.55     1.06     1.90
λ⁻0 Simple               -1.35    -1.09    -1.23    -1.55    -1.72    -2.42    -1.56
λ⁻0 Complex              -1.62    -1.22    -1.08    -2.10    -0.78    -2.12    -1.49
λ⁺ SimpleSlope            0.68     0.05     0.07     0.29     0.36    -0.41     0.17
λ⁺ ComplexSlope           0.74     0.14     0.20     0.90     0.83    -0.29     0.42
λ⁻ SimpleSlope            0.05     0.07     0.29    -0.27    -0.58    -0.36    -0.13
λ⁻ ComplexSlope          -0.06    -0.04    -0.02    -0.76     0.38    -0.72    -0.20
Summary
χ²                       79.1     67.04    54.99    67.42    81.97    72.61   423.12
df                       66       56       69       63       57       72      383
p                         .129     .148     .890     .329     .017     .458     .077

Note. The upper part of the table shows constants derived from direct measures of eye movements. Free parameter values in midsection and goodness of fit in bottom rows. Constants for RG and JO were treated as partially free parameters (see text and note to Table 2).

within and between stimulus objects increases approximately linearly as a function of angular difference. Together, this adds a substantial linear component to overall RT, which presumably reflects processes that prepare for the basic orientation alignment

Table 5
Experiment 2: The Velocity of Mental Rotation

                                     Participants
Parameters             ML      RG      LL      TS      JO      PH       M
α Basic             -0.03    0.34    0.20    0.08    0.07    0.04    0.12
α SecPassSimple      0.00    1.03    0.00    0.64    1.03    0.37    0.51
α SecPassComplex     0.61    0.65    0.16    1.37    1.00    0.49    0.71
α SuccSimple         0.79    1.25   -0.25    2.25    1.47    1.50    1.10
α SuccComplex        1.03    1.93    0.43    2.34    2.80    3.40    1.71
α SimSimple          0.89    1.48    0.22    2.62    4.74    0.68    1.99
α SimComplex         3.59    2.67    1.54   10.71   12.08    1.54    6.12

Note. For abbreviations and conventions see text and Table 1. Prefixes Sim and Succ designate the simultaneous and successive condition, respectively. The estimates in the three upper rows are grand means based on the analysis of eye movements. The first row represents the linear increase (ms/degree) in the duration of fixations in the second pass across simple and complex trials. The second and third rows represent the linear increments in the simple mental rotation functions as a function of complexity. The four lower rows represent grand means based on RTs. The correlation between the simple slopes in the second pass and the simple slope in successive matching is .81. RT = reaction time.
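Because the slopes in Table 5 are expressed in ms per degree while the text quotes rates in ms per 60°, a small conversion makes the comparison explicit. The sketch below uses values copied from Table 5 for participant JO and the group mean M; the dictionary keys and the function name are hypothetical labels chosen for illustration.

```python
# Slopes in ms per degree, copied from Table 5 (participant JO and group mean M).
SLOPES_MS_PER_DEG = {
    "JO": {"basic": 0.07, "sim_complex": 12.08},
    "M": {"basic": 0.12, "sim_complex": 6.12},
}


def per_60_degrees(slope_ms_per_deg):
    """Convert a slope in ms/degree to ms per 60 degrees of angular difference."""
    return slope_ms_per_deg * 60.0


if __name__ == "__main__":
    for who, slopes in SLOPES_MS_PER_DEG.items():
        converted = {name: round(per_60_degrees(s), 1) for name, s in slopes.items()}
        print(who, converted)
```

For JO this gives roughly 725 ms per 60° for the RT-based slope on complex trials and roughly 4 ms per 60° for the basic fixation-duration slope, in line with the figures of about 720 ms and about 4 ms per 60° quoted in the text.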

of the encoded stimulus, rather than the alignment itself, which seems solely to be done during fixations in the second pass. Three lines of evidence converge on this interpretation. First, the overall effect of angular difference on the duration of switch saccades and saccades on stimuli in the second pass is negligible and significantly different from the effect of angular difference on the duration of fixations. Second, there is evidence that mental rotation is not done (or suppressed) during saccades (Irwin & Carlson-Radvansky, 1996; Irwin & Brockmole, 2000; but see also Jonikaitis, Deubel, & de'Sperati, 2009). Third, in line with a number of studies that support the general hypothesis that visuospatial processing is confined to eye fixations (see the review in Irwin, 2004), fixation duration in the second pass increased approximately linearly as a function of angular difference (see Figure 7).

Integrating Local and Global Analyses of Mental Rotation

The linear slope constant for the duration of fixations in the second pass is the same on simple and complex trials, which suggests that the basic orientation alignment during a fixation is done at the same velocity regardless of visual complexity. It seems natural to assume therefore that there is a basic mental rotation velocity that is reflected in the linear increase in fixation durations as a function of angular difference. It is fast (perhaps about 15 ms


per 60°), and orders of magnitude faster than the velocity (about 1,000 ms per 60°) originally reported by R. N. Shepard and Metzler (1971). Nevertheless, by coupling the partitioning of eye movements into initial latency, first pass, saccade switches, and second pass to the parameters in the random walk model, it is possible to bridge this gap. This is illustrated in detail for each participant in Table 5, which in the first row shows the linear increment in the mean duration of fixations in the second pass, and next (Table 5, rows 2 and 3) the linear increment in visual processing time, measured by the mean duration of fixations and saccades on a stimulus in the second pass (i.e., the linear increment in the simple mental rotation in the random walk model). The bottom rows show slope constants based on linear fits to RTs. As can be seen, Participant JO, for example, has a mental rotation rate based on RTs of about 720 ms per 60°, fairly close to the rate (1,000 ms per 60°) reported by R. N. Shepard and Metzler (1971). This rate is based on the contribution from saccade switches (shown for grand means in Table 3) and on JO's (simple) mental rotation rate (roughly between 100 and 180 ms per 60°), which is based on the average time, as a function of angular difference, that JO looked at one object in the second pass. This simple mental rotation slope constant may be further decomposed into the linear combination of constant saccades and linearly increasing duration of fixations. The increase in the duration of fixations in the second pass is small, implying a rapid basic orientation alignment rate (for JO about 4 ms per 60°). It is interesting that, using a more complex task with visual identification of two-dimensional projections of cubes portrayed in different orientations, Dahlstrom-Hakki et al. 
(2008) also found that response times ranging from about 10,000 ms (no difference in orientation) to about 22,500 ms (angular difference of 270°) related to repeated applications of a simple mental rotation function. This very large span in RTs characterized a group of slow male subjects, but even with much faster subjects (RT range 3,500-6,000 ms), and groups in between the very fast and very slow, a reduction of RT functions to repeated application of a simple mental rotation function appears to make very good sense. For the slow group of males, the simple mental rotation (of 0°) was repeated about 10 times with no difference in orientation between stimulus objects. When angular difference was 270°, the mental rotation was repeated about 40 times. Dahlstrom-Hakki et al. (2008) defined gaze duration on a face (of a stimulus cube) as the sum of the fixations on that cube before another face was fixated. The slope of the gaze duration as a function of angular difference in orientation (i.e., the slope of the simple mental rotation function) was roughly the same for slow and fast subjects, and across subjects about 0.41 ms/degree. In Experiment 2, the corresponding slope constants (of about 0.50 and 0.79 ms/degree, see Table 5) include the contribution from the linearly increasing number of saccades as a function of angular difference, which may be elicited by a need to check more subpatterns because objects at large angular differences look more and more dissimilar. However, this will at most amount to about 0.03 ms/degree on the assumption that saccades on stimulus objects are about 27 ms. Also note that more than three or four saccades on an object in the present context seem highly unlikely in view of the limited VSTM capacity of three or four items or less. Keeping this in mind, the estimates of the simple mental rotation


slopes in Dahlstrom-Hakki et al. (2008) and Experiment 2 appear to be of roughly the same order of magnitude.
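The unit bookkeeping in this section can be checked with a few lines of arithmetic. The following sketch converts two of JO's slopes from Table 5 (in ms/degree) into the ms-per-60° unit used in the running text. The dictionary keys and the function name are labels chosen for this illustration and are not part of the original analysis.

```python
# Slopes in ms/degree for participant JO, copied from Table 5.
# The key names are illustrative labels, not the paper's notation.
JO_SLOPES_MS_PER_DEG = {
    "basic_alignment": 0.07,        # fixation duration increment in the second pass
    "simultaneous_complex": 12.08,  # RT slope, simultaneous matching, complex stimuli
}


def ms_per_60_degrees(slope_ms_per_degree):
    """Convert a slope in ms/degree to ms per 60 degrees of angular difference."""
    return slope_ms_per_degree * 60.0


JO_RATES = {name: ms_per_60_degrees(s) for name, s in JO_SLOPES_MS_PER_DEG.items()}

for name, rate in JO_RATES.items():
    print(f"{name}: {rate:.1f} ms per 60 degrees")
```

With these inputs the RT-based rate comes out near 720 ms per 60° and the basic alignment rate near 4 ms per 60°, in line with the values quoted for JO.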

Successive Matching

A major reason for running the successive matching task was to test the idea that the linear increase in the duration of the gaze (i.e., the mean of the sum of the duration of saccades and fixations) on stimuli in the second pass equaled the linear increase in successive matching RTs. Table 5 illustrates that this idea is contradicted by the data. As expected, the correlation between the slope constants in successive matching (Table 5, rows 4 and 5) and corresponding slope constants of the simple mental rotation functions (Table 5, rows 2 and 3) is fairly high (.81). However, in 11 of 12 comparisons the slope constant was larger in the successive matching task than the corresponding slope constant for the simple mental rotation in simultaneous matching (p = .003, cf. Table 5). In the simultaneous matching task the participant has all the information needed for a decision until the trial is terminated by a response. Thus, in the simultaneous matching task, in which the participant fully controls exposure duration, it is always possible to go back and re-encode a stimulus. In response to the experimental conditions in the successive matching task, on the other hand, it makes sense to encode the first stimulus thoroughly by more fixations (and saccades), and possibly encode and retain some features (e.g., verbally) in nonvisual buffers during the 2,000 ms presentation of the first stimulus. Mental rotation of this presumably richer encoded stimulus image in the successive matching condition, in which more features may be aligned with respect to orientation, should tend to generate larger slope constants in the successive matching task.
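The 11-of-12 comparison can be reproduced from the slopes listed in Table 5. The sketch below counts how often the successive matching slope exceeds the corresponding second-pass slope and then computes a one-sided sign test. The text does not state which test produced p = .003, so the sign test is an assumption made here for illustration, and the variable names are likewise hypothetical.

```python
from math import comb

# Slopes (ms/degree) for ML, RG, LL, TS, JO, PH, transcribed from Table 5.
SEC_PASS = {
    "simple": [0.00, 1.03, 0.00, 0.64, 1.03, 0.37],
    "complex": [0.61, 0.65, 0.16, 1.37, 1.00, 0.49],
}
SUCCESSIVE = {
    "simple": [0.79, 1.25, -0.25, 2.25, 1.47, 1.50],
    "complex": [1.03, 1.93, 0.43, 2.34, 2.80, 3.40],
}


def sign_test_one_sided(wins, n):
    """Return P(X >= wins) for X ~ Binomial(n, 0.5)."""
    return sum(comb(n, k) for k in range(wins, n + 1)) / 2 ** n


# Pair each successive-matching slope with the matching second-pass slope.
pairs = [
    (succ, sec)
    for condition in ("simple", "complex")
    for succ, sec in zip(SUCCESSIVE[condition], SEC_PASS[condition])
]
wins = sum(1 for succ, sec in pairs if succ > sec)
p_value = sign_test_one_sided(wins, len(pairs))
print(wins, len(pairs), round(p_value, 3))
```

Under the sign-test assumption, 11 larger slopes out of 12 give a tail probability of 13/4096, which rounds to .003.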

Repeating Mental Rotation

The random walk model accounts for saccade switches between stimulus objects in the simultaneous matching task (see Figure 6), but offers no insight into the details of the underlying processes. Two possibilities seem rather straightforward, however. (a) The participants may solve the task piece by piece; that is, by first encoding a segment of a stimulus, switching to the other stimulus to make the comparison after a simple mental rotation, switching back to the first stimulus, encoding a new (or the same) segment, again switching to the other stimulus to compare the mentally rotated encoded segment, and so forth until evidence sufficient for a match/mismatch decision has been accumulated. If participants stick to this procedure, then it follows that the last saccade switch on a trial should always be odd (1, 3, 5, etc.), never even. Furthermore, because re-encoding (following even-numbered switches) would not entail mental rotation, only fixations following odd switches reflect the basic orientation alignment. (b) After the first switch (or generally any odd-numbered switch), observers may happen to note characteristic features in the stimulus they currently study, and switch back and mentally rotate the image of these features in the opposite direction in order to test for a match. Experiments 1 and 2 were not designed to elucidate these hypotheses. Because the fine-grained processing underlying the switches is uncertain, the estimates of the basic mental rotation velocity that are based on the duration of fixations in the second


pass (see Table 5) may be too high, and should at any rate be viewed with some caution.
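The parity prediction of hypothesis (a) can be made concrete with a small simulation. The sketch below is a simplified stand-in for the random walk model: evidence moves one unit up or down per repetition of the global module until it reaches a symmetric barrier. The step probability, the barrier height, and all function names are hypothetical values chosen for illustration, not fitted parameters from the experiments.

```python
import random


def repetitions_until_decision(p_match_step, barrier, rng):
    """Repeat the global module until evidence reaches +barrier or -barrier.

    Returns (number_of_repetitions, decision).
    """
    evidence = 0
    repetitions = 0
    while abs(evidence) < barrier:
        evidence += 1 if rng.random() < p_match_step else -1
        repetitions += 1
    return repetitions, ("match" if evidence > 0 else "mismatch")


def switches_under_hypothesis_a(repetitions):
    """Count saccade switches if the task is solved piece by piece.

    Each repetition includes one switch to the other stimulus (odd-numbered);
    every repetition except the last is followed by a switch back (even-numbered).
    """
    return 2 * repetitions - 1


rng = random.Random(0)  # fixed seed so the sketch is repeatable
TRIALS = [repetitions_until_decision(0.7, 3, rng) for _ in range(1000)]
SWITCH_COUNTS = [switches_under_hypothesis_a(n) for n, _ in TRIALS]
print(all(count % 2 == 1 for count in SWITCH_COUNTS))
```

Whatever the number of repetitions on a trial, the switch count under hypothesis (a) is odd, which is the property that eye movement data could be used to test.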

Is MT+/V5 the Neural Mechanism That Aligns the Orientation of Visual Images?

In a meta-analysis of 32 investigations of brain activations during mental rotation, Zacks (2008) found strong evidence for the hypothesis that the human motion area V5/MT+ is strongly implicated in mental rotation. Zacks reported that all experiments using transformation-specific contrasts (i.e., within-task comparisons of effects of mental rotation, e.g., comparing large rotations with small rotations) discovered activations at about (-47.5, -59.5, -10.0) in Talairach space, which corresponds to the visual motion area (V5/MT+). It is interesting that studies on the visual psychophysics of motion and mental rotation seem to explain why V5/MT+ is implicated in mental rotation. One group of studies documents that MAE interferes with mental rotation (see, e.g., Corballis & McLaren, 1982; Heil, Bajric, Rösler, & Hennighausen, 1997; Jolicoeur et al., 1998; Seurinck et al., 2011). Another group of investigations reports the remarkably close functional relationships between online visual motion perception of stroboscopic stimuli that differ in orientation or size, and visual identification of stimuli that portray the same objects in different orientation or in different size (see Bundesen, Larsen, & Farrell, 1981; Bundesen, Larsen, & Farrell, 1983; Farrell et al., 1982; Larsen, 1985; Larsen & Bundesen, 2009; R. N. Shepard & Judd, 1976). In the studies of visual apparent motion observers view two stimuli that differ in orientation, size, or both orientation and size. 
The stimuli are presented in sequential alternation (zero ISI) which, provided suitable timing of the stimuli, generates vivid impressions of revolving shape-preserving motion, when the stimuli differ in orientation; impressions of an object that keeps distal size while moving back and forth in visual space, when the stimuli differ in size; and impressions of screw-like helical motion in depth, when the stimuli differ with respect to orientation and size. The minimum SOA in which shape-preserving motion breaks down increases as a linear function of angular difference, and as a linear function of (s - 1)/(s + 1), where s represents the size ratio* between stimuli. When the stroboscopic stimuli differ with respect to orientation and size, SOA thresholds combine additively as a joint function of angular difference and (s - 1)/(s + 1). These results for the SOA dependencies in visual motion have direct analogues in visual identification latencies of objects that differ in orientation, size, or both orientation and size (Bundesen, Larsen, & Farrell, 1981; Larsen, 1985; Sekuler & Nash, 1972). The remarkable functional similarity between visual perception and visual imagery notwithstanding, there is a puzzling difference in the magnitude of temporal effects between SOA thresholds in visual motion perception and RTs in visual object identification. For example, R. N. Shepard and Judd (1976) observed that while the slope in apparent form-preserving revolving motion was of the order of 60 ms per 60°, the corresponding slope in mental rotation was about 1,000 ms per 60°. R. N. Shepard and Judd argued that the difference probably related to mental rotation being "inner" driven and visual motion driven externally. However, the data in Table 5 (rows 2 and 3) point to a direct link at comparable time scales between visual motion perception and

mental rotation. As can be seen from Table 5, simple mental rotations are done at a speed of about 30 or 40 ms per 60°, which fits pretty well with the reported findings for online perception of apparent rotational motion (Bundesen et al., 1983; Farrell et al., 1982; R. N. Shepard & Judd, 1976).
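The additive rule for SOA thresholds described in this section can be written as a single function. In the sketch below, the intercept and both slope coefficients are placeholders invented for illustration; only the rotation slope of 1 ms/degree is loosely based on the 60 ms per 60° figure quoted from R. N. Shepard and Judd (1976). The size ratio is assumed here to be defined as larger over smaller, so that it is at least 1. The point of the sketch is the functional form, not the numbers.

```python
def soa_threshold_ms(angle_deg, size_ratio, intercept_ms=100.0,
                     rotation_slope=1.0, size_slope=200.0):
    """Minimum SOA as an additive function of angle and (s - 1)/(s + 1).

    All default coefficients are hypothetical placeholders.
    """
    if size_ratio < 1.0:
        raise ValueError("size_ratio is taken as larger/smaller, so it must be >= 1")
    size_term = (size_ratio - 1.0) / (size_ratio + 1.0)
    return intercept_ms + rotation_slope * angle_deg + size_slope * size_term


# Additivity: the cost of a combined transformation equals the sum of the
# costs of the rotation and the size change taken separately.
BASE = soa_threshold_ms(0, 1.0)
ROT_COST = soa_threshold_ms(60, 1.0) - BASE
SIZE_COST = soa_threshold_ms(0, 3.0) - BASE
BOTH_COST = soa_threshold_ms(60, 3.0) - BASE
print(ROT_COST, SIZE_COST, BOTH_COST)
```

Because the two terms enter the function separately, the combined cost is simply the sum of the two individual costs, which is what an additive combination of thresholds means in practice.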

Perspectives

Individual differences in mental rotation proficiency are an active research area and have been investigated in numerous studies, for example, in children as a function of age and gender (e.g., Jansen, Schmelter, Quaiser-Pohl, Neuburger, & Heil, 2013), as a function of training (e.g., Heil, Rösler, Link, & Bajric, 1998; Moreau, 2013), and in neurological and clinical syndromes (e.g., Fiorio, Tinazzi, & Aglioti, 2006; Rogers et al., 2002). The analysis and modeling of mental rotation in this article may be very useful in unraveling the nature of the effects on mental rotation in many of these studies. For instance, are effects of training due to a speed-up of the simple (or basic) mental rotation rate, or due to a reduction in saccade switches back and forth between stimulus objects, or a more efficient assembly of the triad of local components in the global module? Is the nature of the developmental change in the ability to identify congruent objects in different orientation related to the concomitant increase in VSTM capacity?

Concluding Remarks

For each of 10 participants, a random walk model accounts for the approximately linearly increasing RTs on positive trials, flat RTs on negative trials, and false alarms and miss rates as functions of angular difference. In addition, the model also predicted effects of complexity and the number of eye movement switches between stimuli as functions of angular difference in orientation for the six participants in Experiment 2. The model assumes that a global module, comprising encoding one of the stimulus objects into VSTM, a simple mental rotation of the encoded image to fit the other object, and match of image and object, is repeated due to the limited VSTM capacity until the accumulated evidence for a response has been sampled. The number of repetitions needed to reach the evidence for a decision is then treated as a random walk (Feller, 1970). The analysis of eye movements supports key aspects of the model by replacing free parameters in the random walk model with measures derived directly from partitioning RTs by the sequence of eye movements. The eye movement analysis shows that processing time is roughly a constant function of angular difference until the first saccade switch between stimulus objects is commenced, while the duration of the remaining trial increases approximately linearly as a function of angular discrepancy. This overall linear increase results from the additive effects of (a) a linear increase in the number, but not the duration, of saccades between stimulus objects, (b) a linear increase in the number of saccades of approximately constant duration on a stimulus, and (c) a linear increase in the number and in the duration of fixations on a stimulus object. The slope constants for the duration of fixations on trials with simple and complex stimuli were small (about 15 ms per 60°), but they do not seem different. The approximately constant duration of saccades as a function of angular difference, the findings that mental rotation may not be done (is suppressed) during saccades (Irwin & Brockmole, 2000; Irwin & Carlson-Radvansky, 1996), and the observation that the duration of fixations (in the second pass) increases approximately linearly as a function of angular difference, all support the hypothesis that the critical orientation alignment takes place during fixations. The small slope constant suggests that this basic orientation alignment of a visual image (or parts thereof) takes place at very high speed (perhaps about 15 ms per 60°), which together with converging evidence from brain imaging and visual psychophysics suggests that the alignment is done by mechanisms developed for online motion perception. In conclusion, deciding whether objects that appear in different orientations are identical may be surprisingly time consuming, sometimes taking 4 or 5 s, occasionally even 10 or 15 s. The paper presents and tests aspects of a general computational framework that explains visual behavior in the classic mental rotation task. The framework integrates (a) the underlying processes running on a millisecond time scale revealed by eye movements to (b) local processes measured on a time scale in hundreds of milliseconds, and at a higher level, (c) local processes to a global process measured in seconds.

* In general, SOA is a linear function of (s - 1)/(s + 1), where s represents the size ratio between stimulus objects. In special viewing conditions (see Larsen & Bundesen, 2009), however, SOA is just a linear function of s - 1.

References

Alvarez, G. A., & Cavanagh, P. (2004). The capacity of visual short-term memory is set both by visual information load and by number of objects. Psychological Science, 15, 106-111. doi:10.1111/j.0963-7214.2004.01502006.x
Anderson, J. R. (1978). Arguments concerning representations for mental imagery. Psychological Review, 85, 249-277. doi:10.1037/0033-295X.85.4.249
Bethell-Fox, C. E., & Shepard, R. N. (1988). Mental rotation: Effects of stimulus complexity and familiarity. Journal of Experimental Psychology: Human Perception and Performance, 14, 12-23. doi:10.1037/0096-1523.14.1.12
Biederman, I. (1987). Recognition-by-components: A theory of human image understanding. Psychological Review, 94, 115-147. doi:10.1037/0033-295X.94.2.115
Borst, G., Kievit, R. A., Thompson, W. L., & Kosslyn, S. M. (2011). Mental rotation is not easily cognitively penetrable. Journal of Cognitive Psychology, 23, 60-75. doi:10.1080/20445911.2011.454498
Bundesen, C. (1982). Item recognition with automatized performance. Scandinavian Journal of Psychology, 23, 173-192. doi:10.1111/j.1467-9450.1982.tb00431.x
Bundesen, C. (1990). A theory of visual attention. Psychological Review, 97, 523-547. doi:10.1037/0033-295X.97.4.523
Bundesen, C., & Larsen, A. (1975). Visual transformation of size. Journal of Experimental Psychology: Human Perception and Performance, 1, 214-220. doi:10.1037/0096-1523.1.3.214
Bundesen, C., Larsen, A., & Farrell, J. E. (1981). Mental transformations of size and orientation. In J. Long & A. Baddeley (Eds.), Attention and performance IX (pp. 279-294). Hillsdale, NJ: Erlbaum.
Bundesen, C., Larsen, A., & Farrell, J. E. (1983). Visual apparent movement: Transformations of size and orientation. Perception, 12, 549-558. doi:10.1068/pp.120549


Bundesen, C., Pedersen, L. F., & Larsen, A. (1984). Measuring efficiency of selection from briefly exposed visual displays: A model for partial report. Journal of Experimental Psychology: Human Perception and Performance, 10, 329-339. doi:10.1037/0096-1523.10.3.329
Carpenter, P. A., & Just, M. A. (1978). Eye fixations during mental rotation. In J. W. Senders, D. F. Fisher, & R. A. Monty (Eds.), Eye movements and the psychological functions (pp. 115-133). Hillsdale, NJ: Erlbaum.
Cohen, D. J., & Kubovy, M. (1993). Mental rotation, mental representation, and flat slopes. Cognitive Psychology, 25, 351-382. doi:10.1006/cogp.1993.1009
Cooper, L. A. (1975). Mental rotation of random two-dimensional shapes. Cognitive Psychology, 7, 20-43. doi:10.1016/0010-0285(75)90003-1
Cooper, L. A. (1976). Demonstration of a mental analog of an external rotation. Perception & Psychophysics, 19, 296-302. doi:10.3758/BF03204234
Cooper, L. A., & Podgorny, P. (1976). Mental transformations and visual comparison processes. Journal of Experimental Psychology: Human Perception and Performance, 2, 503-514. doi:10.1037/0096-1523.2.4.503
Cooper, L. A., & Shepard, R. N. (1973). Chronometric studies of the rotation of mental images. In W. G. Chase (Ed.), Visual information processing (pp. 75-176). Oxford, England: Academic.
Corballis, M. C., & McLaren, R. (1982). Interaction between perceived and imagined rotation. Journal of Experimental Psychology: Human Perception and Performance, 8, 215-224. doi:10.1037/0096-1523.8.2.215
Dahlstrom-Hakki, I., Pollatsek, A., Fisher, D. L., Miller, B., & Rayner, K. (2008). Eye movements and individual differences in mental rotation. In K. Rayner, D. Shen, X. Bai, & G. Yan (Eds.), Cognitive and cultural influences on eye movements (pp. 209-232). Hove, England: Psychology Press.
Dodwell, P. C. (1970). Visual pattern recognition. New York, NY: Holt, Rinehart & Winston.
Edelman, S. (1995). Class similarity and viewpoint invariance in the recognition of 3D objects. Biological Cybernetics, 72, 207-220. doi:10.1007/BF00201485
Farrell, J. E., Larsen, A., & Bundesen, C. (1982). Velocity constraints on apparent rotational movement. Perception, 11, 541-546. doi:10.1068/pp.110541
Feller, W. (1970). An introduction to probability theory and its applications. New York, NY: Wiley.
Fiorio, M., Tinazzi, M., & Aglioti, S. M. (2006). Selective impairment of hand mental rotation in patients with focal hand dystonia. Brain, 129, 47-54. doi:10.1093/brain/awh630
Folk, M. D., & Luce, R. D. (1987). Effects of stimulus complexity on mental rotation rate of polygons. Journal of Experimental Psychology: Human Perception and Performance, 13, 395-404. doi:10.1037/0096-1523.13.3.395
Förster, B., Gebhardt, R.-P., Lindlar, K., Siemann, M., & Delius, J. D. (1996). Mental-rotation effect: A function of elementary stimulus discriminability? Perception, 25, 1301-1316. doi:10.1068/pp.251301
Furmanski, C. S., & Engel, S. A. (2000). Perceptual learning in object recognition: Object specificity and size invariance. Vision Research, 40, 473-484. doi:10.1016/S0042-6989(99)00134-0
Gibson, E. J. (1969). Principles of perceptual learning and development. East Norwalk, CT: Appleton-Century-Crofts.
Graf, M. (2006). Coordinate transformations in object recognition. Psychological Bulletin, 132, 920-945. doi:10.1037/0033-2909.132.6.920
Hebb, D. O. (1949). The organization of behavior. New York, NY: Wiley.
Heil, M., Bajric, J., Rösler, F., & Hennighausen, E. (1997). A rotation aftereffect changes both the speed and the preferred direction of mental rotation. Journal of Experimental Psychology: Human Perception and Performance, 23, 681-692. doi:10.1037/0096-1523.23.3.681


Heil, M., Rösler, F., Link, M., & Bajric, J. (1998). What is improved if a mental rotation task is repeated—The efficiency of memory access, or the speed of a transformation routine? Psychological Research, 61, 99-106. doi:10.1007/s004260050016
Hodgetts, C. J., Hahn, U., & Chater, N. (2009). Transformation and alignment in similarity. Cognition, 113, 62-79. doi:10.1016/j.cognition.2009.07.010
Hyun, J.-S., & Luck, S. J. (2007). Visual working memory as the substrate for mental rotation. Psychonomic Bulletin & Review, 14, 154-158. doi:10.3758/BF03194043
Irwin, D. (2004). Fixation location and fixation duration as indices of cognitive processing. In J. M. Henderson & F. Ferreira (Eds.), The interface of language, vision, and action: Eye movements and the visual world (pp. 105-134). New York, NY: Psychology Press.
Irwin, D. E., & Brockmole, J. R. (2000). Mental rotation is suppressed during saccadic eye movements. Psychonomic Bulletin & Review, 7, 654-661. doi:10.3758/BF03213003
Irwin, D. E., & Carlson-Radvansky, L. A. (1996). Cognitive suppression during saccadic eye movements. Psychological Science, 7, 83-88. doi:10.1111/j.1467-9280.1996.tb00334.x
Jansen, P., Schmelter, A., Quaiser-Pohl, C., Neuburger, S., & Heil, M. (2013). Mental rotation performance in primary school age children: Are there gender differences in chronometric tests? Cognitive Development, 28, 51-62. doi:10.1016/j.cogdev.2012.08.005
Jolicoeur, P., Corballis, M. C., & Lawson, R. (1998). The influence of perceived rotary motion on the recognition of rotated objects. Psychonomic Bulletin & Review, 5, 140-146. doi:10.3758/BF03209470
Jonikaitis, D., Deubel, H., & de'Sperati, C. (2009). Time gaps in mental imagery introduced by competing saccadic tasks. Vision Research, 49, 2164-2175. doi:10.1016/j.visres.2009.05.021
Just, M. A., & Carpenter, P. A. (1976). Eye fixations and cognitive processes. Cognitive Psychology, 8, 441-480. doi:10.1016/0010-0285(76)90015-3
Köhler, W. (1929). Gestalt psychology. Oxford, England: Liveright.
Kosslyn, S. M. (1973). Scanning visual images: Some structural implications. Perception & Psychophysics, 14, 90-94. doi:10.3758/BF03198621
Kosslyn, S. M. (1980). Image and mind. Cambridge, MA: Harvard University Press.
Larsen, A. (1985). Pattern matching: Effects of size ratio, angular difference in orientation, and familiarity. Perception & Psychophysics, 38, 63-68. doi:10.3758/BF03202925
Larsen, A., & Bundesen, C. (1998). Effects of spatial separation in visual pattern matching: Evidence on the role of mental translation. Journal of Experimental Psychology: Human Perception and Performance, 24, 719-731. doi:10.1037/0096-1523.24.3.719
Larsen, A., & Bundesen, C. (2009). Common mechanisms in apparent motion perception and visual pattern matching. Scandinavian Journal of Psychology, 50, 526-534. doi:10.1111/j.1467-9450.2009.00782.x
Larsen, A., McIlhagga, W., & Bundesen, C. (1999). Visual pattern matching: Effects of size ratio, complexity, and similarity in visual pattern matching. Psychological Research/Psychologische Forschung, 62, 280-288. doi:10.1007/s004260050058
Lashley, K. S. (1942). The problem of cerebral organization in vision. In H. Klüver (Ed.), Biological symposia: Visual mechanisms (pp. 301-322). Lancaster, PA: Cattell Press.
Liesefeld, H. R., & Zimmer, H. D. (2013). Think spatial: The representation in mental rotation is nonvisual. Journal of Experimental Psychology: Learning, Memory, and Cognition, 39, 167-182. doi:10.1037/a0028904
Luck, S. J., & Vogel, E. K. (1997). The capacity of visual working memory for features and conjunctions. Nature, 390, 279-281. doi:10.1038/36846

Mach, E. (1902). Die Analyse der Empfindungen und das Verhältnis des Physischen zum Psychischen [The Analysis of Sensations and the Relation of the Physical to the Psychical] (3rd ed.). Jena, Germany: Gustav Fischer.
Metzler, J., & Shepard, R. N. (1974). Transformational studies of the internal representation of three-dimensional objects. In R. L. Solso (Ed.), Theories in cognitive psychology: The Loyola Symposium (pp. 146-201). Oxford, England: Erlbaum.
Moreau, D. (2013). Differentiating two- from three-dimensional mental rotation training effects. Quarterly Journal of Experimental Psychology, 66, 1399-1413. doi:10.1080/17470218.2012.744761
Nakatani, C., & Pollatsek, A. (2004). An eye movement analysis of "mental rotation" of simple scenes. Perception & Psychophysics, 66, 1227-1245. doi:10.3758/BF03196848
Pashler, H. (1988). Familiarity and visual change detection. Perception & Psychophysics, 44, 369-378. doi:10.3758/BF03210419
Phillips, W. A. (1974). On the distinction between sensory storage and short-term visual memory. Perception & Psychophysics, 16, 283-290. doi:10.3758/BF03203943
Pitts, W., & McCulloch, W. S. (1947). How we know universals: The perception of auditory and visual forms. Bulletin of Mathematical Biophysics, 9, 127-147. doi:10.1007/BF02478291
Prime, D. J., & Jolicoeur, P. (2010). Mental rotation requires visual short-term memory: Evidence from human electric cortical activity. Journal of Cognitive Neuroscience, 22, 2437-2446. doi:10.1162/jocn.2009.21337
Pylyshyn, Z. W. (1973). What the mind's eye tells the mind's brain: A critique of mental imagery. Psychological Bulletin, 80, 1-24. doi:10.1037/h0034650
Pylyshyn, Z. W. (1979). The rate of mental rotation of images: A test of a holistic analogue hypothesis. Memory & Cognition, 7, 19-28. doi:10.3758/BF03196930
Pylyshyn, Z. (2003). Return of the mental image: "Are there really pictures in the brain?" Trends in Cognitive Sciences, 7, 113-118. doi:10.1016/S1364-6613(03)00003-2
Rock, I. (1956). The orientation of forms on the retina and in the environment. The American Journal of Psychology, 69, 513-528. doi:10.2307/1419077
Rogers, M. A., Bradshaw, J. L., Phillips, J. G., Chiu, E., Mileshkin, C., & Vaddadi, K. (2002). Mental rotation in unipolar major depression. Journal of Clinical and Experimental Neuropsychology, 24, 101-106. doi:10.1076/jcen.24.1.101.974
Searle, J. A., & Hamm, J. P. (2012). Individual differences in the mixture ratio of rotation and nonrotation trials during rotated mirror/normal letter discriminations. Memory & Cognition, 40, 594-613. doi:10.3758/s13421-011-0172-2
Sekuler, R., & Nash, D. (1972). Speed of size scaling in human vision. Psychonomic Science, 27, 93-94. doi:10.3758/BF03328898
Selfridge, O. G. (1959). Pandemonium: A paradigm for learning. In Mechanisation of thought processes (pp. 511-526). London, England: Her Majesty's Stationery Office.
Seurinck, R., de Lange, F. P., Achten, E., & Vingerhoets, G. (2011). Mental rotation meets the motion aftereffect: The role of hV5/MT+ in visual mental imagery. Journal of Cognitive Neuroscience, 23, 1395-1404. doi:10.1162/jocn.2010.21525
Shepard, R. N., & Cooper, L. A. (Eds.). (1982). Mental images and their transformations. Cambridge, MA: MIT Press.
Shepard, R. N., & Judd, S. A. (1976). Perceptual illusion of rotation of three-dimensional objects. Science, 191, 952-954. doi:10.1126/science.1251207
Shepard, R. N., & Metzler, J. (1971). Mental rotation of three-dimensional objects. Science, 171, 701-703. doi:10.1126/science.171.3972.701
Shepard, S., & Metzler, D. (1988). Mental rotation: Effects of dimensionality of objects and type of task. Journal of Experimental Psychology: Human Perception and Performance, 14, 3-11. doi:10.1037/0096-1523.14.1.3
Shibuya, H., & Bundesen, C. (1988). Visual selection from multi-element displays: Measuring and modeling effects of exposure duration. Journal of Experimental Psychology: Human Perception and Performance, 14, 591-600. doi:10.1037/0096-1523.14.4.591
Sørensen, T. A., & Kyllingsbæk, S. (2012). Short-term storage capacity for visual objects depends on expertise. Acta Psychologica, 140, 158-163. doi:10.1016/j.actpsy.2012.04.004
Sperling, G. (1960). The information available in brief visual presentations. Psychological Monographs: General and Applied, 74, 1-29. doi:10.1037/h0093759
Sutherland, N. S. (1968). Outlines of a theory of visual pattern recognition in animals and man. Proceedings of the Royal Society of London: Series B, 171, 297-317. doi:10.1098/rspb.1968.0072
Takano, Y. (1989). Perception of rotated forms: A theory of information types. Cognitive Psychology, 21, 1-59. doi:10.1016/0010-0285(89)90002-9
Tarr, M. J., & Gauthier, I. (1998). Do viewpoint-dependent mechanisms generalize across members of a class? Cognition, 67, 73-110. doi:10.1016/S0010-0277(98)00023-7


Todd, J. J., & Marois, R. (2004). Capacity limit of visual short-term memory in human posterior parietal cortex. Nature, 428, 751-754. doi:10.1038/nature02466
Vogel, E. K., & Machizawa, M. G. (2004). Neural activity predicts individual differences in visual working memory capacity. Nature, 428, 748-751. doi:10.1038/nature02447
Wheeler, M. E., & Treisman, A. M. (2002). Binding in short-term visual memory. Journal of Experimental Psychology: General, 131, 48-64. doi:10.1037/0096-3445.131.1.48
Yuille, J. C., & Steiger, J. H. (1982). Nonholistic processing in mental rotation: Some suggestive evidence. Perception & Psychophysics, 31, 201-209. doi:10.3758/BF03202524
Zacks, J. M. (2008). Neuroimaging studies of mental rotation: A meta-analysis and review. Journal of Cognitive Neuroscience, 20, 1-19. doi:10.1162/jocn.2008.20013

Received June 25, 2013 Revision received December 9, 2013 Accepted December 9, 2013

Correction to Larsen (2014)

In the article "Deconstructing Mental Rotation" by Axel Larsen (Journal of Experimental Psychology: Human Perception and Performance, Advance online publication. February 10, 2014. doi:10.1037/a0035648), the value of the fixation duration increment, about 15 ms per 60° on simple and complex trials, is erroneous. The correct value for the increment is about 7 ms per 60°. The error is found four times in the text: in the second line from the bottom of the "Abstract," in the seventh line in the section "Integrating Local and Global Analyses of Mental Rotation," and in the section "Concluding Remarks," 14th and 25th line from the bottom of that section. DOI: 10.1037/a0036625

Copyright of Journal of Experimental Psychology. Human Perception & Performance is the property of American Psychological Association and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use.
