HUMAN FACTORS, 1992, 34(6), 655-667

Factors that Affect Depth Perception in Stereoscopic Displays

ROBERT PATTERSON,¹ Department of Psychology, Washington State University, Pullman, Washington, and LINDA MOE and TIGER HEWITT, Department of Psychology, Montana State University, Bozeman, Montana

This study investigated several factors that affect depth perception in stereoscopic displays: half-image separation magnitude, separation direction (crossed vs. uncrossed), viewing distance, stimulus size, and exposure duration. The depth perceived under various combinations of levels of these factors was compared with depth predicted by the geometry of stereopsis. Perceived depth in the crossed separation direction was frequently close to predictions, such that increases in separation and viewing distance produced appropriate increases in perceived depth. Depth in the uncrossed direction was frequently less than that predicted, especially for small stimuli presented at a long viewing distance, with a large half-image separation, and/or with a brief duration. Thus depth in both crossed and uncrossed directions equaled predictions only for large stimuli exposed for a long duration.

¹ Requests for reprints should be sent to Robert Patterson, Department of Psychology, Washington State University, Pullman, WA 99164-4820.

© 1992, The Human Factors Society, Inc. All rights reserved.

INTRODUCTION

There is growing interest in government and industry in the development of visual displays that represent the depth of objects stereoscopically, that is, by presenting the two eyes' views with retinal disparity (interocular differences in the position of corresponding monocular images). Stereoscopic displays produce an enhanced perception of relative depth among objects that can improve the perceptual discrimination of figure from ground (Yeh and Silverstein, 1990). As one example, the Armstrong Laboratory at Wright-Patterson Air Force Base has developed a system (the Super Cockpit Program) in which binocular helmet-mounted displays provide pilots with stereoscopic views of terrain, objects, and system information (Furness, 1986). Other applications of stereopsis, such as telerobotics and meteorology, are discussed in an excellent review by Wickens, Todd, and Seidler (1989).

One fundamental concern involving stereoscopic depth perception is whether the depth perceived is "correct" or valid. This concern arises because disparity by itself is an ambiguous depth cue: a single disparity value will yield different magnitudes of depth, depending on viewing distance. Thus the visual system must rescale disparity information for different perceived distances to produce correct depth perception. In general terms, the valid perception of a distal stimulus (depth) involves a perceptual rescaling of proximal stimulation (disparity) in accordance with changes in perceived viewing distance. The perception of valid depth despite changes in distance is called depth constancy (Cormack and Fox, 1985; Ono and Comerford, 1977; Wallach and Zuckerman, 1963).

In stereoscopic displays, perception is valid if perceived depth corresponds to that predicted by the geometry of binocular viewing of such displays. According to that geometry, the magnitude of predicted depth is given by $d = (S \times D)/(I + S)$ for crossed half-image separations (i.e., crossed disparity or depth in front of fixation) and $d = (S \times D)/(I - S)$ for uncrossed separations (uncrossed disparity or depth behind fixation), where d = predicted depth interval between object and fixation plane, S = separation between monocular half-images of the disparate stimulus, D = viewing distance from observer to fixation, and I = interpupillary distance (Cormack and Fox, 1985; note that disparity is derived from the ratio of separation to viewing distance. Thus crossed disparity is given by the ratio of crossed separation to distance, and uncrossed disparity is given by the ratio of uncrossed separation to distance). For both crossed and uncrossed directions, perceived depth should increase with half-image separation and with distance, with a greater increase occurring in the uncrossed direction. Depth should decrease with increases in interpupillary distance.

These predictions derive from geometry, and it remains an empirical question as to whether depth is perceived as predicted. We addressed this question by attempting to determine some of the conditions under which perceived depth in stereoscopic displays is valid, that is, the conditions under which perceived depth equals geometric predictions. Depending on the particular application, the question of whether perceived depth is valid could be very important. If depth is invalid in a given display system, then certain spatial relations among objects would be distorted, with their depth positions appearing closer or farther than they should. This in turn could be potentially dangerous: for example, in the case of a fighter pilot misperceiving the depth and distance of oncoming enemy aircraft as represented in the kind of display employed at Wright-Patterson Air Force Base.

In two experiments, we investigated three geometric factors: half-image separation magnitude, separation direction (crossed vs. uncrossed), and viewing distance. We wondered whether or not these variables would affect perceived depth; if so, would they affect depth in ways consistent with geometry? A comparison of depth perception in the crossed and uncrossed directions would be especially interesting because much psychophysical and neurophysiological research (for review, see Mustillo, 1985) indicates that the processing of disparity or separation information in the two directions is mediated by separate visual mechanisms. For example, Held, Birch, and Gwiazda (1980) reported that human infants show better stereoacuity when tested with crossed disparities than when tested with uncrossed disparities; sensitivity to crossed disparity develops at an earlier age. Richards (e.g., 1970) showed that approximately 30% of adult observers from large samples are insensitive to disparity of only one direction, crossed or uncrossed, an insensitivity called stereoanomaly. Its selective nature suggests that separate mechanisms mediate depth perception in the two directions. (Note that Patterson and Fox, 1984, showed that the classification of stereoanomaly depends on the use of brief stimulus exposures.) Neurophysiological studies (LeVay and Voigt, 1988; Maunsell and Van Essen, 1983; Poggio and Fischer, 1977; Poggio, Motter, Squatrito, and Trotter, 1985; Poggio and Poggio, 1984) have revealed the existence of several types of disparity-activated neurons in cat and monkey cortex. The responses of two types, the "near" and "far" neurons, may provide the substrate for crossed and uncrossed stereopsis, respectively. Such neurons are activated by disparity of one direction and inhibited by disparity of the opposite direction. Given that crossed and uncrossed disparity are probably processed by separate mechanisms, the effects of half-image separation magnitude and/or viewing distance may differ between the two directions.

We wanted our results to be applicable to stereoscopic displays employed in a variety of situations in government and industry, not to only one specific application or system. In other words, we wanted to investigate fundamental properties of stereoscopic depth perception that would apply to many different kinds of display systems that use disparity as a means of conveying apparent depth. We therefore thought it important to focus our investigation solely on stereopsis, not to examine multiple depth cues or their interaction, because the pattern of such interaction would probably depend on the specific situation within which it occurred.

To meet these goals, we employed stereoscopic stimuli created from dynamic random-element stereograms (Julesz, 1960, 1971). A random-element stereogram is a pair of arrays of randomly ordered elements (e.g., 5000 dots); one array is presented to each eye of an observer. A stimulus defined by disparity is created by shifting laterally a subset of elements in one eye's view and leaving unshifted the corresponding elements in the other eye's view, a shift camouflaged by background elements. An individual with stereopsis can perceive the stimulus as a form positioned in depth either in front of or behind the background (the stimulus can be seen neither monocularly nor by someone without stereopsis). The advantage of random-element stereograms is that they are devoid of nonstereoscopic depth cues, so they isolate and test only visual mechanisms that are sensitive to disparity (Fox, 1985).

METHODS

Observers

Eleven observers served in Experiment 1, and 17 observers served in Experiment 2. The observers were male and female college students recruited from an introductory psychology class (they received credit for participation). All had normal or corrected-to-normal acuity in both eyes (Snellen acuity of 20/33 or better) as determined by the Ortho-Rater test (manufactured by Bausch and Lomb) and good binocular vision as determined by performance on a test of crossed and uncrossed stereopsis involving the resolution of various geometric shapes embedded in static random-dot stereograms (Julesz, 1971). The observers were naive with respect to the hypotheses under consideration.

Apparatus

The dynamic random-element stereogram generation system (Fox and Patterson, 1981; Shetty, Brodersen, and Fox, 1979) consisted of the following two parts: (1) The display device was a 13-inch color monitor (NEC Multisync II), the red and green guns of which were electronically controlled by the stereogram generator. In Experiment 1 the angular size of the display screen was 21.0 × 16.0 deg (viewing distance of 75 cm) or 10.5 × 8.0 deg (viewing distance of 150 cm). In Experiment 2 the size of the screen was 10.5 × 8.0 deg (viewing distance of 150 cm). The electronic control of the red and green guns produced random-dot matrices composed of red and green dots on the display screen. Stereoscopic viewing was accomplished by the anaglyph method; red and green filters were placed before the eyes of the observer. (2) The stereogram generator was a hard-wired device that generated random dots, created disparity (producing a stereoscopic stimulus), and specified the x,y coordinates of the stimulus. All dots were replaced dynamically, with positions assigned randomly, at a rate of 60 Hz, which allowed the stimulus to be briefly exposed without monocular cues. Duration of the stimulus was controlled electronically in integer multiples of the frame duration of the monitor. In both experiments the stereoscopic stimulus was configured as a square.

Procedure

On each trial, the stereoscopic square was presented in the center of the display screen at a given disparity and depth position. A small black fixation point, binocularly viewed, was located in the center of the screen and located stereoscopically at the same distance as the screen (the fixation point was slightly larger than the background dots and static rather than dynamic). The observer's task was to maintain fixation of the fixation point (for brief exposures only) and to estimate the depth of the square from the display screen (fixation plane) by two methods. In the verbal method, the observer judged depth as a percentage of viewing distance (e.g., if depth between square and screen equaled one half the distance, the observer responded "50"). In the probe method, the observer judged depth by instructing the experimenter to align a probe stimulus so as to appear coplanar with the stereoscopic square. The probe stimulus was a vertically oriented steel rod outside the stereoscopic display. The experimenter moved the rod in depth beside the display until the observer perceived the rod to be in the same depth plane as the square within the display. Because the rod was a physical object, multiple cues to its depth position existed (e.g., disparity, size, interposition). It is likely that the observer made full use of these cues in judging the depth of the rod with respect to that of the square; therefore it is unlikely that the observer was simply nulling or matching the disparity of the two stimuli during the trials. For each observer under each experimental condition, one trial was performed with the verbal method and five trials were performed with the probe method.

EXPERIMENT 1

This experiment investigated the effects of half-image separation magnitude, separation direction (crossed vs. uncrossed), and viewing distance on perceived depth, factors predicted by geometry to affect depth in stereoscopic displays. To create conditions similar to those occurring in many applied settings (see Yeh and Silverstein, 1990), the stimulus was exposed for an unlimited duration (i.e., until the observer responded) on each trial. For these trials, the observer was told that the execution of eye movements and changes in fixation were permitted.

Note that vergence eye movements would alter the magnitude and possibly direction (from crossed to uncrossed or vice versa) of the disparity of the stimulus by varying the position of the observer's horopter relative to the display screen. This could be important in cases where changing the direction of the separation or disparity renders the depth of the object more difficult to perceive. Recall that crossed and uncrossed disparity are probably processed by functionally separate mechanisms (Mustillo, 1985). Recent research from this laboratory (Patterson, Cayko, Flanagan, and Taylor, 1989; Patterson, Short, and Moe, 1989; see also Patterson and Fox, 1984) suggests that uncrossed depth may be invalid under certain conditions (see Discussion section).

One common method for preventing stimulus-elicited vergence eye movements from occurring during experimental trials is to expose test stimuli very briefly, for example, 180 ms or less. Such durations are briefer than the latency of vergence movements (Westheimer, 1954). Under these conditions, although eye movements could occur, their execution would necessarily follow termination of the trial. Also note that brief durations may be relevant to certain applied situations wherein an operator must scan across a display and make depth judgments of multiple elements, thus having only a limited time available for estimating the depth of any one element.

Given these considerations, we decided to compare depth perception with a long stimulus exposure (allowing eye movements) and perception with a brief exposure (no eye movements). Thus in addition to measuring depth while the square was exposed for an unlimited duration, we also measured depth while the square was exposed for a duration of 160 ms. For these trials, the observer was instructed to maintain fixation of the fixation point before initiating an exposure. Several exposures were performed on each trial (a pause of several seconds occurred between exposures), with the number of exposures (typically 5-10) determined by each observer individually. Multiple exposures were employed so as to permit confident and reliable judgments of depth to be made, which was not possible if the stimulus was exposed briefly only once during each trial.

Design and Procedure

Two magnitudes of half-image separation (0.3 and 0.7 cm), two directions of separation (crossed and uncrossed), two viewing distances (75 and 150 cm), and two exposure durations (unlimited or long vs. 160 ms or brief) were factorially combined to make a 2 × 2 × 2 × 2 within-subjects factorial design. Because the display monitor and surrounding laboratory environment were clearly visible to the observer, we have assumed that perceived distance to the display screen equaled physical distance in the calculations of predicted depth, given later. Verbal estimates obtained after formal data were collected suggested that this assumption is valid. The angular size of the stereoscopic square at the retina was 6.25 deg square at the two distances employed (i.e., physical size of the square on the display screen was scaled with changes in distance so that retinal size was constant). The 16 conditions were presented to each observer individually in a haphazard order, during a session lasting 1.5 hours.

These separations and distances produced disparities ranging from 7 to 32 arcmin. With stereograms, disparity is defined as $r = 57.3 \times S/D$, where r is disparity (in degrees of visual angle), S is separation between the half-images of the display, and D is viewing distance. All observers reported that the stereoscopic square appeared perceptually fused with no sign of double images. Erkelens (1988) showed that disparities as large as 1.0 deg or larger in the crossed and uncrossed directions can be fused in random-dot stereograms. In research employing methods of stimulus generation similar to those used here, Patterson and Fox (1990) presented to their observers a crossed disparity of 0.99 deg without problems of fusion. The maximum value of 32 arcmin in this experiment, and the maximum value of 36 arcmin in Experiment 2, are well within such fusional limits, which is why no double images were reported.

Results

The results for perceived depth measured with the probe method were very similar to those determined verbally. In a previous study (Patterson, Cayko, Flanagan, and Taylor, 1989), probe estimates were correlated very highly (+0.96) with verbal estimates. Thus the conclusions that follow are not peculiar to a given measure of depth. Because depth is measured more precisely with the probe method, we present results obtained with that method.

Figure 1 shows mean perceived depth for the crossed and uncrossed half-image (interocular) separations obtained with the two distances and the long exposure. The horizontal line above each histogram indicates the depth predicted by geometry. (In all predictions reported in this paper, a standard value of interpupillary distance of 6.5 cm was assumed.) Increases in both separation and distance produced increases in depth, which were slightly greater in the crossed relative to the uncrossed direction. A recent study by Parrish and Williams (1990) also showed that increases in distance produce increases in perceived depth for both crossed and uncrossed directions. In the present study, for all combinations of separation and distance, perceived depth in the crossed direction is equal to prediction. Depth in the uncrossed direction is much less than prediction, especially for the large separation and distance.
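For readers who wish to reproduce the geometric predictions against which perceived depth is compared, the relations d = S × D / (I + S) for crossed separations and d = S × D / (I - S) for uncrossed separations can be evaluated directly. The following Python sketch is illustrative only and is not part of the original study; the function names are hypothetical, and it assumes the 6.5 cm interpupillary distance and the Experiment 1 separations (0.3 and 0.7 cm) and viewing distances (75 and 150 cm).

```python
# Illustrative sketch; not code from the original study.
IPD_CM = 6.5  # standard interpupillary distance assumed in the paper's predictions


def predicted_depth_cm(separation_cm, distance_cm, direction, ipd_cm=IPD_CM):
    """Depth interval between stimulus and fixation plane predicted by geometry."""
    if direction == "crossed":
        # Depth in front of the fixation plane: d = S * D / (I + S)
        return separation_cm * distance_cm / (ipd_cm + separation_cm)
    if direction == "uncrossed":
        if separation_cm >= ipd_cm:
            raise ValueError("uncrossed separation must be smaller than the interpupillary distance")
        # Depth behind the fixation plane: d = S * D / (I - S)
        return separation_cm * distance_cm / (ipd_cm - separation_cm)
    raise ValueError("direction must be 'crossed' or 'uncrossed'")


def experiment1_predictions():
    """Return (distance, separation, direction, predicted depth) for all 8 geometric conditions."""
    rows = []
    for distance_cm in (75, 150):
        for separation_cm in (0.3, 0.7):
            for direction in ("crossed", "uncrossed"):
                depth = predicted_depth_cm(separation_cm, distance_cm, direction)
                rows.append((distance_cm, separation_cm, direction, depth))
    return rows


if __name__ == "__main__":
    for distance_cm, separation_cm, direction, depth in experiment1_predictions():
        print(f"D={distance_cm} cm, S={separation_cm} cm, {direction}: {depth:.2f} cm")
```

Under these assumptions the largest predicted depths occur at the 150 cm distance with the 0.7 cm separation: about 14.6 cm in the crossed direction and about 18.1 cm in the uncrossed direction, which illustrates why geometry predicts the greater increase for uncrossed separations.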

Figure 2 shows the results obtained with the brief exposure. The pattern of results is similar to that shown in Figure 1, with even greater departures from prediction with the brief exposure.

A four-way analysis of variance (ANOVA) for repeated-measures designs revealed that the following effects were reliable: main effect of separation magnitude, F(1,10) = 64.93, p < 0.0001; main effect of distance, F(1,10) = 122.58, p < 0.0001; main effect of exposure duration, F(1,10) = 8.17, p < 0.02; interaction between duration and separation direction, F(1,10) = 4.95, p = 0.05; interaction between duration and distance, F(1,10) = 8.07, p < 0.02; interaction between duration and separation magnitude, F(1,10) = 18.31, p < 0.002; interaction between distance and separation magnitude, F(1,10) = 39.18, p = 0.0001; interaction among duration, separation direction, and separation magnitude, F(1,10) = 5.54, p < 0.05; interaction among duration, distance, and separation magnitude, F(1,10) = 15.57, p < 0.01; the interaction among duration, separation direction, distance, and separation magnitude approached significance, F(1,10) = 4.44, p = 0.06.

Figure 1. Perceived depth obtained under the crossed and uncrossed separation conditions and two viewing distances. Interocular separation = separation between the half-images of the stereoscopic stimulus. LE = long (unlimited) stimulus exposure. Horizontal line above each histogram depicts depth predicted by geometry. Each histogram is an average of 11 observers. Standard errors ranged from 0.3 to 1.8 cm.

Figure 2. Perceived depth obtained under the crossed and uncrossed separation conditions and two viewing distances. Interocular separation = separation between the half-images of the stereoscopic stimulus. BE = brief (160 ms) stimulus exposure. Horizontal line above each histogram depicts depth predicted by geometry. Each histogram is an average of 11 observers. Standard errors ranged from 0.4 to 1.4 cm.

The results of Experiment 1 show that depth in the crossed direction followed predictions quite closely, especially when the stimulus was presented with a long exposure. Depth in the uncrossed direction was frequently less than predicted, especially when the stimulus was presented with a large half-image separation, at a long distance, and/or with a brief exposure.

When viewing distance varied, although angular size of the stereoscopic square was kept constant by appropriate scaling of its physical size, the angular size of the individual dots constituting the stereogram did vary because their physical size was not scaled. This means that the spatial frequency content (in the luminance domain) of the stereogram varied with distance: the amplitude of high spatial frequencies increased while that of low spatial frequencies decreased as distance increased. Thus changes in perceived depth with distance could have been produced by variation in spatial frequency rather than by changes in distance per se.

To control for this possibility, we performed the following subsidiary experiment with two observers. In three experimental conditions, we measured depth (probe method) with a half-image separation of 0.7 cm in both the crossed and uncrossed directions and an unlimited exposure duration. In two conditions depth was measured at the 75 and 150 cm distances, respectively, as outlined previously. In the third condition, depth was measured at the distance of 150 cm while the observer viewed the display through a 2X-magnification telescope created from optometric lenses (physical size on the display screen of the stereoscopic square was not scaled for this condition). The telescope increased retinal image size of the dots (and of the square) so as to compensate for the increase in distance; thus spatial frequency content of the stereogram was held constant.

For both observers the results showed that perceived depth in the crossed direction increased with distance in accordance with predictions, but uncrossed depth increased only slightly, or decreased, with distance, contrary to predictions. Importantly, perceived depth at the 150 cm distance was very similar for the telescope and nontelescope conditions, suggesting that the effect of distance on depth revealed in the main experiment was not produced by variation in dot size/luminance spatial frequency.

EXPERIMENT 2

Although stimulus size is not predicted by geometry to affect perceived depth, we observed during pilot testing that the size of the stereoscopic stimulus did influence depth, especially in the uncrossed direction: depth was less than predicted when the stimulus was small, and depth equaled prediction when the stimulus was large. Note that size in this case refers to the size of the disparity embedded in the stereogram, which is different from the effect of familiar size as a cue to distance or depth perception (i.e., large familiar stimuli appear closer and small stimuli appear farther away). Because the effect studied here pertains to the size of a simple geometric form (square), the influence of familiar size on depth perception should be minimal. Indeed, predictions of uncrossed depth based on familiar size would be opposite to the effect revealed in our pilot testing: uncrossed depth should have been greater with the smaller size and less with the larger size, because in the former case the stimulus would have appeared farther from the observer and in the latter case it would have appeared closer, a pattern of results that we did not obtain.

This experiment formally investigated the size effect. We also examined the effects of half-image separation magnitude and separation direction on perceived depth in order to replicate the results of Experiment 1. As in Experiment 1, the stimulus was exposed both for a long (unlimited) and for a brief (160 ms) duration.

Design and Procedure

Two magnitudes of separation (0.5 and 1.5 cm), two directions of separation (crossed and uncrossed), two sizes (1.0 and 30.25 deg square), and two exposure durations (long versus brief) were factorially combined to make a 2 × 2 × 2 × 2 within-subjects factorial design. Viewing distance was 150 cm. These separations produced disparities ranging from 11.5 to 34 arcmin. The 16 conditions were presented to each observer in a haphazard order.
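The disparity range quoted for this design can be checked with the small-angle relation used for stereograms in Experiment 1, r = 57.3 × S / D (r in degrees of visual angle, S and D in the same units). The short Python sketch below is illustrative only and is not part of the original study; the function name is hypothetical, and the constant 57.3 is the rounded degrees-per-radian value given in the text.

```python
# Illustrative sketch; not code from the original study.
DEG_PER_RADIAN = 57.3  # rounded constant used in the paper's definition of disparity


def disparity_arcmin(separation_cm, distance_cm):
    """Disparity in arcmin from r = 57.3 * S / D, where r is in degrees."""
    disparity_deg = DEG_PER_RADIAN * separation_cm / distance_cm
    return disparity_deg * 60.0  # 60 arcmin per degree


if __name__ == "__main__":
    # Experiment 2 separations at the fixed 150 cm viewing distance
    for separation_cm in (0.5, 1.5):
        value = disparity_arcmin(separation_cm, 150)
        print(f"S={separation_cm} cm at 150 cm: {value:.1f} arcmin")
```

Evaluated this way, the 0.5 and 1.5 cm separations at 150 cm correspond to roughly 11.5 and 34 arcmin, in line with the range reported above.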

Results

The results for perceived depth measured with the probe method were very similar to those determined verbally; the results using the probe method were as follows. Figure 3 shows mean perceived depth for the crossed and uncrossed half-image (interocular) separations obtained with the two sizes and the long exposure. The horizontal line above each histogram indicates predicted depth. Increases in both separation and size produced increases in depth. With the large stimulus, the increase in depth with separation was greater in the uncrossed direction than in the crossed direction; with the small stimulus, the increase in depth with separation was greater in the crossed direction. Perceived depth was equal to prediction in both crossed and uncrossed directions with the large stimulus, but depth was less than predicted with the small stimulus, especially for the 1.5 cm separation in the uncrossed direction.

Figure 3. Perceived depth obtained under the crossed and uncrossed separation conditions and two stimulus sizes (the size given in the legend is for one dimension of the stimulus only, height or width). Interocular separation = separation between the half-images of the stereoscopic stimulus. LE = long (unlimited) stimulus exposure. Horizontal line above each histogram depicts depth predicted by geometry. Each histogram is an average of 17 observers. Standard errors ranged from 0.2 to 4.4 cm.

Figure 4 shows the results obtained with the brief exposure. The pattern of results is similar to that shown in Figure 3, with even greater departures from predictions with brief stimulation.

A four-way ANOVA for repeated-measures designs revealed that the following effects were reliable: main effect of separation magnitude, F(1,16) = 69.64, p < 0.0001; main effect of separation direction, F(1,16) = 5.30, p < 0.05; main effect of size, F(1,16) = 35.48, p < 0.0001; main effect of duration, F(1,16) = 23.03, p < 0.001; interaction between separation direction and size, F(1,16) = 36.27, p
