Vision Res. Vol. 32, No. 8, pp. 1513-1521, 1992 Printed in Great Britain. All rights reserved

Copyright © 1992 Pergamon Press Ltd. 0042-6989/92 $5.00 + 0.00

The Role of Binocular Vision in Prehension: a Kinematic Analysis

PHILIP SERVOS,* MELVYN A. GOODALE,*† LORNA S. JAKOBSON*

Received 8 November 1991

This study examined the contribution of binocular vision to the control of human prehension. Subjects reached out and grasped oblong blocks under conditions of either monocular or binocular vision. Kinematic analyses revealed that prehensile movements made under monocular viewing differed substantially from those performed under binocular conditions. In particular, grasping movements made under monocular viewing conditions showed longer movement times, lower peak velocities, proportionately longer deceleration phases, and smaller grip apertures than movements made under binocular viewing. In short, subjects appeared to be underestimating the distance of objects (and as a consequence, their size) under monocular viewing. It is argued that the differences in performance between the two viewing conditions were largely a reflection of differences in estimates of the target’s size and distance obtained prior to movement onset. This study provides the first clear kinematic evidence that binocular vision (stereopsis and possibly vergence) makes a significant contribution to the accurate programming of prehensile movements in humans.

Humans; Prehension; Visuomotor behavior; Monocular; Binocular; Limb movements; Distance estimation; Visual feedback

*Department of Psychology, University of Western Ontario, London, Ontario, Canada N6A 5C2.
†To whom all correspondence should be addressed.

INTRODUCTION

The study of depth vision in humans has concentrated almost entirely on perceptual judgments about the visual world, and has largely ignored the role of depth cues in the programming and execution of skilled motor behavior. Moreover, most of these studies have focussed on estimates of the relative depth of objects as opposed to their actual distance from an observer. Yet many everyday actions, such as reaching out and picking up an object, require precisely the latter type of estimate. For example, perceiving that a coffee cup is closer to you than a box of cornflakes will be of limited use in planning the movements required to pick up that cup. What is needed here is an accurate estimate of the actual distance of the cup so that an efficient reaching movement can be executed without constant monitoring of the relative distances of the hand, cup and cereal box.

What then are the visual cues that might form the basis of such estimates? In principle, a number of cues could be used, particularly with familiar objects. With respect to prehensile movements directed toward unfamiliar stationary targets, however, there are four strong candidates: (1) motion parallax, (2) accommodation, (3) vergence movements, and/or (4) stereopsis (in conjunction with binocular vertical disparities or perhaps vergence information). Although monocular cues such as accommodation (Biersdorf, Ohwaki & Kozil, 1963; Fisher & Ciuffreda, 1988; but see Morrison & Whiteside, 1984) and motion parallax (Ferris, 1972; Gogel, 1982) could clearly provide some distance information, it is commonly believed that binocular cues are the most important source of absolute distance information (Bishop, 1989; Foley, 1980). But while several authors have suggested that such cues might play a critical role in the programming and execution of prehension in primates, including humans (Previc, 1990; Sheedy, Bailey, Buri & Bass, 1986), there have been almost no systematic investigations of the role of binocular vision in the control of this important skill. The purpose of the present study, then, was to examine the effect of removing this important source of distance information on the kinematics of normal reaching and grasping movements in humans.

What evidence is there that binocular vision can provide estimates of distance that are accurate and reliable enough for the programming of prehensile movements? The perceptual literature would appear to suggest that convergence by itself is unable to provide the accurate distance estimates that are implicit in most acts of prehension. Thus, while observers are able to estimate the absolute distance of objects on the basis of convergence alone, their trial-to-trial performance is rather variable (Irving & Ludvigh, 1936; Heineman, Tulving & Nachmias, 1959; Gogel, 1961; Ogle, 1962; Foley & Held, 1972; Morrison & Whiteside, 1984). These studies are consistent with the finding that humans have great difficulty estimating the degree to which their eyes are converged (Hill, 1972). At the level of perceptual report then, convergence appears to be a poor candidate for the generation of absolute distance estimates.

It should be emphasized, however, that nearly all of these studies have relied on some sort of explicit report or cognitive judgment (i.e. they have depended on the subject’s conscious perception of distance). Subjects have not been asked to produce a motor output, such as a manual aiming movement, where the distance estimate is implicit in the act itself rather than explicitly required. It is entirely possible therefore that binocular cues such as convergence might provide accurate distance information for the control of motor output even though this information is not available for conscious perceptual report. Dissociations between perceptual report and visuomotor control have certainly been observed in a number of different paradigms where the location of a visual target has been manipulated (Bridgeman, Lewis, Heit & Nagle, 1979; Goodale, Pelisson & Prablanc, 1986). Indeed, recent neuropsychological evidence suggests that the neural substrates for perception and associated cognitive judgments may be quite independent of those underlying the visual control of skilled movements of the hand and limb (Goodale, Milner, Jakobson & Carey, 1991; Milner & Goodale, 1992).

Paradoxically, absolute distance information must be implicitly available to permit the occurrence of certain perceptual phenomena such as stereoscopic depth constancy (Ono & Comerford, 1977). Vergence is one of two sources of information that could provide the necessary absolute distance information for stereoscopic depth constancy (Foley, 1980); the other is vertical binocular disparities (Longuet-Higgins, 1982; Bishop, 1989; although see Cumming, Johnston & Parker, 1991). In contributing to the required computations, both of these mechanisms would appear to operate at a level that is most likely inaccessible to perceptual report.
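
As a purely geometric illustration of how vergence could, in principle, specify absolute distance, the sketch below relates fixation distance to vergence angle for a target on the midline. It is not part of the original study: the interocular separation of 6.3 cm is an assumed typical value, symmetric fixation is assumed, and the function names are hypothetical.

```python
import math

ASSUMED_IPD_CM = 6.3  # assumed interocular separation, not a value from the study


def vergence_angle_deg(distance_cm, ipd_cm=ASSUMED_IPD_CM):
    """Vergence angle (degrees) when both eyes fixate a midline target."""
    return math.degrees(2.0 * math.atan((ipd_cm / 2.0) / distance_cm))


def distance_from_vergence_cm(angle_deg, ipd_cm=ASSUMED_IPD_CM):
    """Invert the geometry: fixation distance (cm) implied by a vergence angle."""
    return (ipd_cm / 2.0) / math.tan(math.radians(angle_deg) / 2.0)


# Compare distances within reach with distances beyond it.
for d_cm in (20, 30, 40, 100, 200):
    print(f"{d_cm:4d} cm -> {vergence_angle_deg(d_cm):5.2f} deg")
```

Under these assumptions the angle changes by several degrees between 20 and 40 cm but by less than two degrees between 1 and 2 m, so a fixed error in sensing the angle corresponds to a much larger distance error farther from the observer.
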
Moreover, both mechanisms function optimally at egocentric distances of up to approx. 1 m, viz. a region generally corresponding to prehension space. Thus, as was indicated earlier, either of these mechanisms, or both, could be used in the implicit computations of absolute distance required by the sensorimotor systems supporting prehension. Nevertheless, vergence and vertical disparities will generate absolute distance estimates only when the objects lie in the fixation plane. To compute the absolute distance of objects lying on either side of the horopter, horizontal disparities (stereopsis and even diplopic images) must also form part of the equation. In short, a constellation of binocular cues (e.g. vergence, vertical disparities, and stereopsis) could theoretically provide the information required for the programming of accurate prehensile movements of the hand and limb.

There have been only a few attempts to investigate the role of such cues in the control of prehension. Moreover, such attempts have relied on rather indirect measures of performance, such as time to complete a task and accuracy, and have not looked at the kinematics of the actual movements produced under the different viewing conditions. Moreover, the role of absolute distance estimation in the performance of these tasks has been largely ignored. Sheedy et al. (1986), for example, compared the performance of subjects on several manual tasks (e.g. threading beads onto a string) under monocular versus binocular viewing conditions. It was found, in general, that performance was best in the binocular condition. Tasks like threading beads, however, do not require the computation of absolute distance. While one task used by Sheedy et al. (1986), tossing bean bags at targets, presumably did involve some estimation of distance, only sketchy information about subjects’ performance was provided. Furthermore, the kinematics of the constituent movements were never examined in any of these tasks.

The present study not only examined the kinematics of prehension under monocular vs binocular viewing conditions but also varied the size and distance of the objects that the subjects were required to pick up. Moreover, a viewing environment was selected which afforded a rich array of monocular and binocular depth and distance cues, an array similar to that available in everyday life. It was reasoned that if kinematic measures of movements made under binocular viewing differed significantly from those made under monocular viewing then this would provide strong support for the argument that binocular distance cues are critical to the guidance of manual prehension.

What kinds of kinematic measures would be expected to change as a function of viewing condition? Evidence from a number of anatomical, neurological, and developmental studies suggests that visually guided prehension consists of two relatively independent, but temporally-coupled, components (for review, see Jeannerod, 1988). One of these components is the reach itself in which the hand is transported to the location of the target object. The second component is the grasp, in which the posture of the hand and fingers is adjusted to reflect the size, shape and orientation of the object well before contact is made. A number of studies have shown that the peak velocity of the reach and several other transport kinematics vary as a function of object distance, whereas the grasp itself varies primarily as a function of the size of the target object (Jeannerod, 1988). The calibration of the grasp, of course, also depends on estimates of distance (Jakobson & Goodale, 1991), particularly with unfamiliar objects where object distance must be combined with the size of the subtended retinal image to compute object size. Thus, in the present experiment, where object size and distance were varied randomly from trial to trial, it was anticipated that the removal of binocular distance cues would interfere with the calibration of both the transport and the grasp components of manual prehension.

METHOD

Subjects

Nine undergraduates with normal or corrected-to-normal vision participated for pay (five males and four females, mean age = 22.6 yr). All subjects were strong right-handers as determined by a modified version of the Edinburgh Handedness Inventory (Oldfield, 1971). Six subjects were right-eye dominant while the remaining three subjects were left-eye dominant. All subjects had stereoscopic vision in the normal range with assessed stereoacuities of 40" of arc or better as determined by the Randot Stereotest (Stereo Optical Co., Chicago, Ill.).

Apparatus

Subjects sat at a table, 100 cm wide and 55 cm deep. The surface of the table was painted flat black. A circular 1 cm dia microswitch button located 15 cm from the subject functioned as the start position for each reaching movement. This button was located directly at the body midline. A circular fluorescent lamp was suspended approx. 80 cm above the table surface. This lamp, in which the condenser was pre-activated, could be illuminated by the experimenter from a remote switch which also triggered the start of data collection. Full illumination was achieved within 80 msec.

Three red, oblong wooden blocks with the following top surface dimensions were used: 2 × 5, 3 × 7.5, 5 × 12.5 cm. All of the objects were 2 cm high. The underside of each of the objects contained an embedded magnet, and could be positioned so as to make contact with one of three magnetic switches located under the table surface at distances of 20, 30 or 40 cm from the microswitch, along the midline. Upon picking up an object the contact between these two magnets was broken, signaling the end of collection for a given trial.

Three 4 mm dia i.r. light-emitting diodes (IREDs) were attached with small pieces of cloth tape to the head of the radius at the wrist, the distal portion of the right border of the thumbnail, and the distal portion of the left border of the index fingernail. The tape permitted complete freedom of movement of the hand and fingers. The three IREDs were monitored by two high-resolution cameras positioned approx. 2 m from the subject. The instantaneous positions of the IREDs were digitized at a rate of 100 Hz into two-dimensional coordinates and then passed on to the data collection system of a WATSMART computer (Waterloo Spatial Motion Analysis and Recording Technique, manufactured by Northern Digital Inc., Waterloo, Ontario).

Procedure

Subjects were instructed at the beginning of each session to make quick, accurate, and natural reaches with their right hand, picking up each object with their thumb and index finger along the long axis of the object, which was always perpendicular to the body midline. They were instructed to pick up the block as soon as the overhead light was illuminated and the block became visible. Subjects were told that prior to the start of a given trial they were to place the tips of the index finger and thumb of their right hand on the start button. For approximately a 5 sec period before a given trial (i.e. as soon as the overhead fluorescent light was extinguished from a previous trial), subjects sat in the dark with their eyes closed. Once a block had been placed in a given position by the experimenter, subjects were given a ready signal which prompted them to open their eyes and to anticipate the illumination of the overhead light approx. 1-2 sec later.

Two testing sessions were administered, each spaced at least 24 hr apart. The first session consisted of a handedness questionnaire, a test for eye dominance, a stereoacuity test, and a block of 72 experimental trials. Five subjects were tested first under binocular viewing conditions while the remaining four subjects were first tested monocularly using their dominant eye (the nondominant eye was patched). The second session consisted of 72 trials either under monocular or binocular viewing conditions followed by 2 counterbalanced blocks (one monocular, the other binocular) of 27 trials of a simple reaction-time (RT) task. The experimental set-up for the RT task was identical to that used in the prehension conditions except that instead of picking up an object, subjects simply lifted their thumb and index finger off of the start key as quickly as possible when a block became visible. Subjects were explicitly instructed not to reach towards the blocks. RT was monitored by the release of the start key.

On each test day, the 72 reaching trials consisted of 8 instances of each of the 9 possible distance × object size combinations. Trial presentations were random except for the stipulation that no more than 3 consecutive, identical trials were allowed. Also, in order to look at possible practice effects, each block of 72 trials consisted of two blocks of 36 trials (4 instances of each distance × object size combination). The simple RT task consisted of 3 instances of each of the 9 possible distance × object size combinations.
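
The trial structure described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the text does not say how the random orders were generated, so the reshuffle-until-valid approach and all function names here are assumptions.

```python
import random
from itertools import product

DISTANCES_CM = (20, 30, 40)
BLOCK_SIZES_CM = ("2 x 5", "3 x 7.5", "5 x 12.5")  # top-surface dimensions


def max_run_length(trials):
    """Length of the longest run of consecutive identical trials."""
    longest = run = 0
    previous = object()  # sentinel that equals no real trial
    for trial in trials:
        run = run + 1 if trial == previous else 1
        longest = max(longest, run)
        previous = trial
    return longest


def make_half_block(rng, reps=4, max_run=3):
    """36 trials: `reps` instances of each of the 9 distance x size cells."""
    trials = list(product(DISTANCES_CM, BLOCK_SIZES_CM)) * reps
    while True:  # assumed method: reshuffle until the run constraint holds
        rng.shuffle(trials)
        if max_run_length(trials) <= max_run:
            return list(trials)


def make_session(seed=None, max_run=3):
    """72 trials built from two 36-trial halves (used to assess practice)."""
    rng = random.Random(seed)
    while True:
        session = make_half_block(rng) + make_half_block(rng)
        if max_run_length(session) <= max_run:  # also check the join
            return session


session = make_session(seed=1)
print(len(session), "trials; longest run =", max_run_length(session))
```

Building the session from two balanced halves guarantees that each half contains 4 instances of every combination, which is what allows the first and last 36 trials to be compared.
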
All prehension conditions were preceded by a series of 5 practice trials while the simple RT conditions were preceded by a series of 3 such trials. Any trials in which the subject dropped an object were repeated at the end of a given block. Such occurrences were rare. Each testing session lasted approx. 90 min.

Accuracy of system

Calibration of the WATSMART system involved placing in the experimental workspace a rigid frame to which were attached 24 IREDs at known locations. The WATSMART calibration software calculated the three-dimensional root-mean-square error of reconstruction for the locations of a minimum of 22 IREDs to be < 2 mm. A procedure similar in principle to that described by Haggard and Wing (1990) was used to provide an independent assessment of the system’s accuracy. Three IREDs were embedded in a rigid surface to form the vertices of a right-angled triangle measuring approx. 10 × 15 × 18 cm. The “triangle” was positioned adjacent to the start key of the experimental apparatus and in another trial it was placed along the midline, approx. 30 cm beyond this position in the x (forward-going) dimension. The three-dimensional coordinates of the static IREDs were sampled for 2 sec in each location at a sampling frequency of 100 Hz. Comparisons of the average distance between any two given IREDs in both regions of the workspace were quite consistent, with differences ranging from 0.93 to 2.18 mm. The standard deviations of these measurements within each of the 2 sec sampling periods varied from 0.29 to 1.10 mm.

Data processing

The stored sets of two-dimensional coordinates were converted into three-dimensional coordinates off-line and filtered (a second-order Butterworth filter with a 7 Hz cut-off). The IREDs on the index finger and thumb provided information about the grip portion of the reach while all other kinematic variables were based on information from the wrist IRED.

Dependent measures

Nine kinematic measures were computed from the three-dimensional coordinates corresponding to a given prehensile movement. These were: (1) time to movement onset (measured as the time for the thumb and index finger to release the mechanical start key); (2) movement duration (calculated by subtracting the movement onset time from the time at which an object was lifted, breaking the magnetic switch); (3) maximum grip aperture (the maximum vectored distance between the thumb and index finger IREDs); (4) peak resultant velocity and (5) the time at which it occurred following movement onset; (6) peak acceleration in the x (forward/backward) dimension and (7) the time at which it occurred following movement onset; (8) peak deceleration in the x (forward/backward) dimension and (9) the time at which it occurred following movement onset. Measures (4)-(9) were based upon data from the wrist IRED.

RESULTS

For each of the nine subjects, mean values of each of the dependent variables were calculated across a minimum of 6 observations for each size × distance combination in each viewing condition. (Equipment failure resulted in some loss of data, but this constituted < 1% of the trials.) The mean values were entered into separate 2 × 3 × 3 × 2 (viewing condition × object size × object distance × practice) repeated measures analyses of variance. (The factor Practice refers to a comparison between the first and last 36 trials of a given condition. There was no significant main effect of Practice for any kinematic variable. Nor were there any interpretable interactions with any other factor.) Degrees of freedom were corrected according to the Huynh-Feldt adjustment (Huynh & Feldt, 1976). All tests of significance were based upon an alpha level of 0.05.

The effects of viewing condition on the transport component

Summary of main findings. Figure 1 shows the velocity profiles of two individual trials, one made under binocular viewing conditions, the other under monocular. Many of the differences in the transport component that became apparent under analysis of variance are illustrated in this figure. Under monocular vision, the latency to begin the movement and the movement duration were longer than under binocular vision. In addition, the peak velocity and acceleration of the reach under monocular vision were reduced relative to binocular vision. Finally, the time spent decelerating was longer under monocular viewing, particularly in the period of low velocity movement at the very end of the reach. (Means and tests of significance for each of these measures are summarized in Tables 1 and 2.)

FIGURE 1. Velocity profiles for two reaches made by the same subject: one reach was made under normal binocular control, the other under monocular control. (Distance = 30 cm; object width = 3 cm; abscissa: time in msec.)

TABLE 1. Summary table of effect of viewing condition on various kinematic variables (SEM values indicated in parentheses)

                                        Viewing condition
Kinematic variable                      Monocular      Binocular      F statistic
Movement onset (simple) (msec)          615 (8.9)      500 (10.9)     F = 5.14, P < 0.05
Movement onset (msec)                   578 (3.4)      496 (3.0)      F = 87.25, P < 0.001
Movement duration (msec)                838 (13.7)     611 (13.1)     F = 21.40, P …
Peak velocity (mm/sec)                  905 (15.6)     1099 (23.7)
Peak acceleration (dm/sec²)             49 (1.0)       67 (1.7)
Time to peak velocity (msec)            255 (3.0)      221 (3.7)
Time to peak acceleration (msec)        116 (2.6)      104 (3.4)
Maximum grip aperture (mm)              84 (1.1)       90 (1.0)
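
The filtering step described under Data processing and three of the measures listed under Dependent measures (peak resultant velocity, its time after movement onset, and maximum grip aperture) can be sketched as follows. This is a minimal illustration under stated assumptions, not the WATSMART software: the text does not say whether the filter was applied in one pass or as a zero-lag forward-backward filter, nor how velocity was differentiated, so the single forward pass, the central-difference derivative, and all function names below are assumptions.

```python
import math

FS_HZ = 100.0     # sampling rate given in the text
CUTOFF_HZ = 7.0   # filter cut-off given in the text


def butterworth_lowpass(samples, cutoff_hz=CUTOFF_HZ, fs_hz=FS_HZ):
    """Second-order Butterworth low-pass (bilinear transform), one forward pass."""
    k = math.tan(math.pi * cutoff_hz / fs_hz)
    norm = 1.0 / (1.0 + math.sqrt(2.0) * k + k * k)
    b0 = k * k * norm
    b1, b2 = 2.0 * b0, b0
    a1 = 2.0 * (k * k - 1.0) * norm
    a2 = (1.0 - math.sqrt(2.0) * k + k * k) * norm
    # Initialise the filter state at the first sample to avoid a start-up transient.
    x1 = x2 = y1 = y2 = samples[0]
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out


def filter_trajectory(xyz):
    """Filter each coordinate axis of a list of (x, y, z) samples."""
    axes = [butterworth_lowpass(axis) for axis in zip(*xyz)]
    return list(zip(*axes))


def resultant_speed(xyz, fs_hz=FS_HZ):
    """Central-difference speed; element j corresponds to sample j + 1."""
    return [math.dist(xyz[i + 1], xyz[i - 1]) * fs_hz / 2.0
            for i in range(1, len(xyz) - 1)]


def summarize_reach(wrist, thumb, index, onset_idx=0, fs_hz=FS_HZ):
    """Peak wrist speed, its time after onset (msec), and maximum grip aperture."""
    wrist, thumb, index = (filter_trajectory(m) for m in (wrist, thumb, index))
    speed = resultant_speed(wrist, fs_hz)
    start = max(onset_idx - 1, 0)  # speed[j] belongs to sample j + 1
    peak_j = max(range(start, len(speed)), key=speed.__getitem__)
    aperture = [math.dist(t, f) for t, f in zip(thumb, index)]
    return {
        "peak_velocity": speed[peak_j],
        "time_to_peak_velocity_ms": (peak_j + 1 - onset_idx) * 1000.0 / fs_hz,
        "max_grip_aperture": max(aperture[onset_idx:]),
    }
```

Here `wrist`, `thumb` and `index` are lists of (x, y, z) positions in mm sampled at 100 Hz, and `onset_idx` is the sample at which the start key was released. Peak acceleration and deceleration in the x dimension would follow the same pattern by differentiating the x component of the wrist velocity. Note that a single forward pass delays the signal slightly, so timing measures from this sketch would be shifted relative to a zero-lag filter.
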
