Vision Res. Vol. 30, No. 11, pp. 1877-1895, 1990. Printed in Great Britain. All rights reserved

A STEREOSCOPIC VIEW OF VISUAL PROCESSING STREAMS

CHRISTOPHER W. TYLER

Smith-Kettlewell Eye Research Institute, 2232 Webster Street, San Francisco, CA 94115, U.S.A.

(Received 10 August 1989; in revised form 30 March 1990)

Abstract-Recent anatomical and physiological studies of the visual pathway suggest the existence of at least three parallel processing streams in the lateral geniculate/primary cortex system: a magno/interblob stream for motion and transient information; a parvo/interblob stream for high spatial frequency, static information; and a parvo/blob stream for chromatic and low spatial frequency information. How does this functional typology relate to the processing for stereoscopic depth? Human stereopsis may be viewed as consisting of three distinct types of disparity processing: coarse, local stereopsis suitable for stereomovement processing by the magno/interblob stream; fine, global stereopsis suitable for the processing of complex random-dot stereograms by the parvo/interblob stream; and simple, protostereopsis for processing size differences between the two eyes by the parvo/blob stream. Extensive psychophysical evidence supports the identification of these three disparity processes with the three processing streams.

Stereopsis, Color, Motion, Stereomotion, Cyclopean, Temporal frequency

NEURAL STREAMS AND PSYCHOPHYSICAL SYSTEMS

Recent physiological studies of primate visual cortex and its inputs suggest the existence of at least three parallel processing streams. These are defined histologically in terms of the layering of the lateral geniculate nucleus (LGN) into parvocellular and magnocellular layers, and of their projection to striate cortex terminating in histological blob and interblob regions defined by cytochrome oxidase autoradiography (described in Hubel & Livingstone, 1987; DeYoe & Van Essen, 1988). It is appropriate for me only to summarize the conclusions of the original authors, especially considering that only sparse information is available for many of the physiological functions considered. In broad outline (Fig. 1), three functional streams are identified from the combined anatomical and physiological studies: a parvo > blob stream processing mainly chromatic and low spatial frequency information (PB substream); a parvo > interblob stream emphasizing high spatial frequency, static information (PI substream); and a magno > interblob stream predominantly for motion and transient luminance information (M stream). These streams serve as an inspiration for the psychophysical partition of sensory processing into categories of specialized analysis. Psychophysically, a

coherent picture has emerged in which high temporal frequency, low spatial frequency and motion targets have one set of properties (the psychophysical M system), while position targets with the remaining combinations of spatiotemporal attributes have a different set of properties (the psychophysical PI system). Finally, purely chromatic targets with luminance equated at all points in the image have a third set of properties and constitute a psychophysical PB system. It is proposed to use these sets of conditions to explore how stereopsis fits into this psychophysical scheme, without prejudice in relation to the underlying anatomical and physiological mechanisms. For many perceptual attributes, the physiological substrate is investigated by means of stimulus variables derived from psychophysical analysis. It is therefore worthwhile to develop the psychophysical analysis of a perceptual attribute in its own right, before applying the results to further physiological investigation.

Neural processing of stereopsis

The main question to be addressed is how the tripartite functional typology might relate to the processing for stereoscopic depth. Livingstone and Hubel (1987), for example, assign all stereoscopic processing to the magno stream, largely on the basis that cells in this stream are typically

Fig. 1. Simplified diagram of neural processing streams at three levels within the visual pathway (retina; lateral geniculate nucleus, LGN; and striate visual cortex, V1). The imputed role of each stream for stereopsis is indicated at the right side. [Diagram labels: Receptors, Retina, LGN, V1. Stream roles at right: Coarse stereopsis, Fine stereopsis, Protostereopsis.]

binocular with selectivity for the stimulus disparity, and that stereopsis is substantially weakened for isoluminant chromatic stimuli, as are the magno cell responses. However, this logic neglects the existence of different types of stereopsis with different sensitivities to chromatic isolation (reviewed below). It also neglects the presence of the interblob projection of the parvo stream, which is also probably insensitive to color differences and therefore would also be impaired at isoluminance. A stereoscopic system in this PI substream would thus mimic the color-blind behavior of the M stream. DeYoe and Van Essen (1988) claim a stereoscopic role for the PI substream on the basis that most cells in this stream are binocular, although they have not shown a specific coding for binocular disparity in these cells. In support of the latter claim of stereopsis in the PI substream, Schiller, Logothetis and Charles (1990) report that the effects of selective lesions in either the parvo- or the magnocellular layers of the LGN of monkeys provide a strong basis for behavioral stereopsis in the parvo stream. With finely detailed random-dot stereograms as test stimuli, they showed that monkeys' behavioral detection performance was unaffected by severe magno lesions, but profoundly affected by corresponding parvo lesions. These results place fine cyclopean stereopsis firmly in the parvo stream, but it must presumably be placed in the PI substream since it is strongly degraded at isoluminance (Lu & Fender, 1972; de Weert, 1979), and would therefore not be well-supported by the color-coded cells of the PB substream. Conversely, stereopsis for coarse dot arrays

partially survived the parvo lesions, suggesting that local disparity processing might be mediated by the magno rather than the parvo pathway (Schiller et al., 1990). Thus the current evidence suggests that at least two of the identified information streams are involved in stereoscopic processing. The purpose of this paper is to argue that all three streams may serve a role, each in a different aspect of stereopsis.

Categories of human stereopsis

This section provides a summary of the types of stereopsis that are defined and reviewed in more detail in subsequent sections. Human stereopsis may be viewed as consisting of at least two categories of disparity processing: fine, global stereopsis suitable for complex random-dot stereograms (Julesz, 1971) and coarse, local stereopsis suitable for stereomovement processing (Tyler, 1971, 1975a; Richards, 1972; Regan & Beverley, 1973a). Random-dot stereopsis is fine in the sense that it processes fine image detail over a small disparity range, but global in the sense that it must disambiguate the multiplicity of false matches in the random-dot arrays. Conversely, stereopsis for large, simple targets is coarse in the sense of extending to large disparities, but local in the sense that no global disambiguation mechanism is required. Both types, however, depend upon positional binocular disparity cues for the depth information. There is evidence for a third type of stereopsis which does not rely on positional disparity cues. This is the perception of depth tilt from a difference in spatial frequency between the two

eyes, which can be driven by the frequency ratio per se rather than by the positional disparity relations (Blakemore, 1970; Tyler & Sutter, 1979). Because there were indications that this perception of depth tilt was a more primitive neural capability, it was labeled “protostereopsis” by Tyler and Sutter (1979). Since positional disparity processing could be excluded from consideration, this type of depth perception must be analyzed by a third mechanism of some kind.

Extensive psychophysical evidence can be marshalled to support the identification of three categories of stereopsis with the neural processing streams. In broad summary, this paper will show that fine stereopsis resembles a parvo/interblob mechanism because it has a slow response latency, requires a high acuity for random-dot stereopsis, shows a position-specific limit for disparity oscillation and is strongly degraded for equiluminant chromatic stimuli. Coarse, off-horopteral stereopsis resembles a magno mechanism because it is as rapid as any motion response, extends to greater eccentricities, and shows a motion-specific limit for disparity oscillations.

However, the question of the attribution of other components of stereopsis among the physiological subdivisions is more complex than is suggested by this simple snapshot. After reviewing the evidence for a bipartite division of stereopsis into fine and coarse processes with different properties, I will assemble the indications for other components of stereopsis, and explore how they might be assigned to the physiological streams.

CONCEPTUAL ANALYSIS OF DISPARITY MECHANISMS

Separation of fine and coarse mechanisms in static stereopsis

In the functional attribution of stereopsis it is important to recognize that stereopsis is not a unitary mechanism but must be subdivided into functional components. It will be argued that the most important subdivision is between fine and coarse stereopsis, which operate over different disparity ranges. The evidence to be reviewed suggests that fine stereopsis operates in a range up to about 20 min disparity, with coarse stereopsis operating on disparities beyond that range. The specific crossover value depends on the stimulus eccentricity and other factors, so it should not be taken as definitive.

Definitions. It is important to distinguish several dichotomies related to the distinction between fine and coarse aspects of stereoscopic processing. The concept of subdivisions within a perceptual system to process different ranges of the relevant stimulus domain is common. Typically, these ranges are specified in terms of the stimulus size (or spatial frequency content). In such a case, “fine” and “coarse” would refer to high- and low-spatial frequency ranges, respectively. In stereopsis, however, the situation is more complicated, since the dichotomized stimulus domain may be disparity rather than spatial frequency, or a combination of the two variables. The fine/coarse disparity distinction may include part or all of the distinctions implied by several other dichotomies that have been discussed in the literature. These are the local/global, cyclopean/noncyclopean, and fusion/diplopic disparity distinctions.

(i) Local/global. Local mechanisms will be defined as those that involve no interactions at or beyond the initial disparity processing level, whereas global mechanisms are those that do have such interactions. Thus local disparity mechanisms are those that process disparity in one region on the visual field without reference to the disparities present in other regions of the field. They may also be local with respect to other disparities present in the same region of the field. Under this definition, no constraint is placed on the magnitude of the disparity that can qualify as local.

(ii) Cyclopean/noncyclopean. Although Julesz (1971) has tended to regard “cyclopean” as equivalent to “global”, this identification blurs an important distinction between stimulus and mechanism. “Cyclopean” will be defined here as a term to be applied to any stimulus feature which is invisible monocularly but visible by means of the disparity-processing mechanisms. Such mechanisms may themselves be local or global, in terms of definition (i), whether the stimulus is cyclopean or not.

Note that on the one hand this definition does not limit cyclopean stimuli to random-element stereograms, while on the other hand stimuli that are cyclopean in one mode of presentation may become noncyclopean in another. An example of a nonrandom-element cyclopean stimulus is a grating of different orientations or spatial frequencies in the two

eyes. Each monocular stimulus alone carries no hint that the depth information encodes a tilted plane, so the stimulus should be regarded as cyclopean. Conversely, an example of reversion from a cyclopean to a noncyclopean state occurs when two (cyclopean) static random-dot stereograms (RDS) with different disparities are presented in succession; there is a monocularly visible differentiation of the central square. In order to render the change invisible monocularly, dynamic RDS must be used to provide a continuous monocular change with which to mask the disparity-specific change. Thus cyclopean is an operational concept that must be defined in relation to the specific stimuli to be used.

(iii) Fusion/diplopia. Fusion of the two local monocular images into a single binocular percept, or the failure of such fusion resulting in two perceived images (diplopia), has been known since the time of Panum (1858) or before. One myth that should be dispelled is that the transition between fine and coarse disparity processing (or for either of the other dichotomies) is connected with Panum's fusional limit. While stereopsis operates on horizontal disparities, Panum's limit is a property of the fusion mechanism for either horizontal or vertical disparities, and hence must involve a mechanism independent from stereoscopic disparity processing (see Tyler, 1983). Moreover, diplopia does not exist for RDSs. No matter how large the disparity, an RDS will not appear diplopic in the sense of being perceived with twice the dot density, although the static type of RDS will eventually go into rivalry (Tyler, 1975b). Dynamic RDSs do not exhibit rivalry, and are in some sense always fused.

To summarize, the limits of global, cyclopean and fine disparity processing are each logically independent of each other, and of the fusion limit.
That there may be empirical coincidences for a particular stimulus configuration does not detract from the logical and functional independence of these limits over the range of potential stimuli.

The present treatment will emphasize the separate mechanisms for the processing of fine and coarse disparity stimuli. It should be stressed that in principle each of these mechanisms might operate by either local or global processes, and also that each might operate on either cyclopean or noncyclopean stimuli. In practice, there is usually a correspondence between global processing and fine disparities, although local processing may occur for either fine or coarse disparities.

Explication of the local/global distinction. By definition, a local disparity mechanism is one in which the disparity at one location in the field is processed without reference to the disparities present at other locations. However, local disparity processes may well be affected by retinal stimulus features, such as the size and orientation of the local retinal images carrying the disparity signal. Local processes may also be affected by retinal image features at other locations on the retina, as exemplified by retinal lateral inhibition. This may be regarded as a retinal globality (as opposed to the cortical globality in which interactions occur between local disparity detectors in the cortex). A final feature of local processing is that it is not necessarily limited to small disparities. In fact, the evidence to be reviewed suggests that it is the larger disparities which are processed by local mechanisms.

Global, interactive processing, on the other hand, does seem to be limited to the small disparity range. Many different types of global processing have been proposed, from lateral interactions between disparity detectors (Nelson, 1975) to iterative computation of the best stereoscopic figure in the disparity array (Julesz, 1962; Marr & Poggio, 1979). But a definition of global disparity processing that encompasses them all is that it involves interactions between local disparity detection mechanisms.

A version of the fine/coarse dichotomy was embodied in Ogle's (1950) distinction between patent and qualitative stereopsis. In the fovea, Ogle gave the limit of patent stereopsis as about 10 min, although he does not specify what criterion was used to define it (it was not fusion, because this had a separate, lower limit). Julesz (1962) developed the concept of a global system which was supposed to operate only on fine disparities, which he typically specifies as less than 6 min.
He also recognized the existence of a separate local system that would account for the perception of large disparities in the range of qualitative stereopsis. Thus he not only postulated separate fine and coarse disparity mechanisms, but associated the fine mechanism with both global and cyclopean processing, while the coarse disparity mechanism was considered to be local and noncyclopean. This separation of stereoscopic processing into two types, fine and coarse disparity processing, has been a

Fig. 2. (A) Explanation of the velocity limit for sinusoidal oscillations. Waveforms depict stimulus position (ordinate) as a function of time (abscissa). To have equal slopes of maximum velocity (arrow), the amplitude must be scaled in proportion to the temporal period (decreased in proportion to the increase in frequency). (B) Velocity limit data. Movement detection threshold as a function of temporal frequency for sinusoidal and triangular-wave motion (separate symbols) in counterphase in two lines each at 1.5 deg from the fovea. Inset shows that when amplitudes are matched, the maximum velocity in the triangular wave is lower by a factor of 1.57, as shown in the predicted threshold elevation of the same factor (distance between two full curves). Low frequency data conform to the slope of -1 predicted by the maximum velocity limit in (A). (C) Position limit data for two observers (separate symbols). Stimulus was a sinusoidal wavy line with a spatial period of 0.33 deg (inset). Thresholds were invariant below 2 Hz, as predicted from the position information available in the maximum amplitude of the stimulus. Error bars represent ±1 average standard deviation over each data set. [Abscissa in (B) and (C): Frequency (Hz).]

feature of many recent theories (see Julesz, 1978). Supporting psychophysical evidence will be reviewed in a subsequent section, with the predominant fine/coarse dichotomy being drawn in most cases at a disparity range of 15-20 min.

Temporal distinctions between motion and position systems

A second distinction that will be important in dissecting the pathways of stereopsis concerns the temporal properties of the response. The temporal dimension is useful to isolate the separate contributions of motion and position systems to stereoscopic sensitivity. One paradigm that is particularly suited to generalization to the stereoscopic domain is the oscillation threshold technique developed by Tyler (1971)

and Tyler and Torres (1972). The threshold behavior at low temporal frequencies is the key feature for discrimination between motion or position control of oscillation detection. The signature of a true motion system is that threshold should be velocity-limited. For sinusoidal oscillation, this translates into a scaling of the stimulus waveform with temporal frequency, so that the maximum velocity (at the zero crossing point) remains constant at detection threshold (see Fig. 2A). Threshold amplitude should therefore decrease inversely with temporal frequency when controlled by a velocity-limited system, conforming to a slope of - 1 on log-log coordinates. Figure 2B shows that the velocity-limit prediction is well-validated by data for either sinusoidal or triangular oscillatory motion at 1.5 deg

Fig. 3. (A) Velocity limit governs low frequency motion detection in peripheral view. Stimuli were bright lines 1 min wide and 1 deg high at 20 deg eccentricity. Error bars represent ±1 standard deviation. Data conform well to a slope of -1 (straight lines) up to 3 Hz without a reference, and show very similar behavior when a static reference line is placed at 20 min separation from the stimulus. (B) In foveal vision, the data again conform to a velocity limit without a reference, but flatten out to approximate a frequency-invariant position limit when the static reference is present at 20 min separation. [Abscissa: log temporal frequency (Hz). Legend: with reference, no reference.]

eccentricity (from Nakayama & Tyler, 1978). Conversely, a system whose threshold is totally position-limited should show a fixed amplitude threshold across low temporal frequencies, because threshold should be independent of the temporal properties of the stimulus, and controlled purely by its spatial position relative to a comparison line. This result is best shown (Fig. 2C) by data for discrimination of the deviation from straightness on a sinusoidally wavy line, which contains many local position references (Nakayama & Tyler, 1981). Thus the slope of the threshold function at low temporal frequencies provides an indicator of the operation of position vs motion-limited processing. Tyler and Torres (1972) had demonstrated separate processing systems for motion and position using sinusoidal oscillations of line targets. The periphery from 5 to 20 deg was essentially blind to position information (Fig. 3A), because: (1) peripheral motion sensitivity was similar with and without an adjacent reference line; and (2) at low frequencies, peripheral thresholds decreased inversely with frequency, as expected for motion sensitivity.
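The two threshold signatures described above can be illustrated with a small numerical sketch. The Python fragment below is illustrative only: the threshold values `V_THRESH` and `A_THRESH` and all function names are hypothetical and are not taken from the data. It assumes an ideal sinusoid, computes the amplitude threshold predicted by a pure velocity limit and by a pure position limit, and also reproduces the peak-velocity ratio between matched-amplitude sinusoidal and triangular waves that is quoted as 1.57 in the caption of Fig. 2.

```python
import math

# Hypothetical threshold values, chosen only for illustration (not from the paper).
V_THRESH = 2.0   # velocity threshold, min arc per second
A_THRESH = 0.5   # position (displacement) threshold, min arc


def velocity_limited_amplitude(freq_hz, v_thresh=V_THRESH):
    """Amplitude at which a sinusoid A*sin(2*pi*f*t) just reaches peak velocity v_thresh."""
    # Peak velocity of the sinusoid is 2*pi*f*A, so A = v_thresh / (2*pi*f).
    return v_thresh / (2.0 * math.pi * freq_hz)


def position_limited_amplitude(freq_hz, a_thresh=A_THRESH):
    """A pure position limit gives the same amplitude threshold at every frequency."""
    return a_thresh


def loglog_slope(model, f_low, f_high):
    """Slope of log10(threshold amplitude) against log10(temporal frequency)."""
    rise = math.log10(model(f_high)) - math.log10(model(f_low))
    run = math.log10(f_high) - math.log10(f_low)
    return rise / run


slope_velocity = loglog_slope(velocity_limited_amplitude, 0.2, 2.0)
slope_position = loglog_slope(position_limited_amplitude, 0.2, 2.0)

# For matched amplitude A and frequency f, a triangular wave moves at a constant
# speed of 4*A*f, whereas the sinusoid peaks at 2*pi*A*f; the ratio is pi/2.
ratio = (2.0 * math.pi) / 4.0

print(f"velocity-limited slope: {slope_velocity:.2f}")
print(f"position-limited slope: {slope_position:.2f}")
print(f"sine/triangle peak-velocity ratio: {ratio:.2f}")
```

Under these assumptions the velocity-limited model yields a log-log slope of -1 and the position-limited model a slope of 0, which is the contrast used in the text to distinguish motion from position control of oscillation detection.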

Conversely, foveal sensitivity to the oscillating line (Fig. 3B) behaved as though the fovea has a processor of position information, in addition to a motion-sensitive mechanism, because: (1) it was sensitive to the presence and distance of a static reference line; (2) sensitivity was roughly constant in the low frequency range, as if threshold required the occurrence of a minimum fixed displacement independent of the velocity (Fig. 3B); and (3) masking of the fixation point as a position reference reduced foveal sensitivity to the velocity-sensitive form of the periphery. In summary, the data were completely consistent with the notion of a predominantly static, position-sensitive mechanism dominating oscillation sensitivity in the fovea, giving way to a purely motion-sensitive mechanism in the near periphery. This picture was a precursor of the current psychophysical view of the characteristics of the sustained (= position-sensitive) foveal P system and the transient (= motion-sensitive) peripheral M system. However, the approach through line motion contains some significant advantages over the more conventional grating stimuli: (1) line motion sensitivity measures have unlimited amplitude and can be performed at any contrast, as distinct from grating motion


sensitivities which are limited to half a cycle (or to the measurement of contrast threshold for grating drift), and are performed by definition around contrast threshold; (2) empirically, position and velocity sensitivities may be separated by up to a log unit at low frequencies. The double logarithmic slope thus provides a clear assay for the relative strength of the two processes with eccentricity; (3) line stimuli provide a straightforward extension to the stereo case, as required for this analysis. The simplest approach is to present the sinusoidal line motion in counterphase in the two eyes, producing an impression of pure motion in depth, but other, more complex rotatory paradigms have also been developed (e.g. Regan & Beverley, 1973b,c).

PHYSIOLOGICAL EVIDENCE FOR SEPARATE PROCESSING MECHANISMS IN STEREOPSIS

Neurophysiological studies

Poggio, Gonzalez and Krause (1988) have extended the work of Poggio and Fischer (1977) to suggest that binocular interactions of neurons in the behaving monkey cortex may be grouped into a number of distinct classes. Neurons which are predominantly binocular (in the classical sense of having identifiable receptive fields for monocular stimulation of each eye) showed a variety of types of disparity selectivity. The types identified were as follows: F, flat (excitatory for all disparities); TN/TF, tuned near/far (excitatory with nonzero disparity tuning); TO, excitatory with zero disparity tuning; TI, inhibitory with zero disparity tuning. More surprisingly, they found that most cells with classically monocular receptive fields had “reciprocal” disparity sensitivity, showing binocular facilitation for a broad range of near disparities and binocular suppression for a similar range of far disparities (RN cells), or vice versa (RF cells). Finally, there was a group of cells with unclassifiable responses to static disparity. Poggio and Talbot (1981) have suggested that many of these cells are selective for motion in depth (DM cells). Cells with flat tuning (F) comprised 50% of the sample in area V1, with the other five groups accounting for roughly equal proportions of the remaining cells. Note that cells with a flat disparity tuning are unresponsive to disparity differences, although they show binocular facilitation at all disparities.


These results suggest a possible neural basis for fine and coarse stereopsis, based on the different types of receptive field wiring rather than different disparity ranges of the same type of receptive field. (Any such interpretation must still be viewed with caution, however, since Poggio's studies have included only horizontal disparities. If the same cell types were found for vertical disparities, the results would have to be attributed to nonstereoscopic mechanisms, since stereopsis operates only on horizontal disparities.) A natural interpretation (Poggio & Talbot, 1981) is that cells of the tuned excitatory type (TN, TO, TF) would correspond to fine stereopsis, since virtually all of those had tuning peaks within 12 min of zero disparity in area V1. Near and far reciprocal cells (RN and RF) would provide the basis for coarse stereopsis, since their facilitatory response may extend out to 1-2 deg disparity. The inhibitory type (TI) might well be involved in the mechanism of binocular rivalry. It could then interact with the stereoscopic system by activating the suppression of stereopsis when stimuli occur beyond a certain range of disparities, or beyond a certain proportion of binocular mismatches.

Evoked potential studies

An electrophysiological technique which is applicable to the analysis of human stereopsis is the study of disparity evoked potentials (DEPs). The properties of the DEP for different types of stereoscopic stimulus help to distinguish the mechanisms underlying human disparity processing. Initial work in DEPs implied the existence of two separate disparity systems on the basis of the response speeds implied by the latency of the transient DEPs. Regan and Spekreijse (1970), using a monocularly-visible shift in the center panel of a static RDS, found a DEP component which had the same latency (about 130msec) as the response to the monocular shift alone, but which resulted in a much larger amplitude when the shift was produced by a horizontal binocular disparity. Lehmann, Skrandies and Lindemaier (1978) and Julesz, Kropfl and Petrig (1980), on the other hand, used dynamic RDS in which no monocular cues could be detected. The disparity-specific response to this stimulus showed a dominant latency of about 250 msec in both studies, which must be attributed to purely cyclopean cortical responses. Thus the cyclopean responses were much slower than in the experiment with


monocular cues, suggesting that global stereopsis required a longer neural processing time than did local stereopsis. Further support for the local/global mechanisms in human stereopsis comes from an evoked potential study by Norcia, Sutter and Tyler (1985). They used a dynamic RDS in which a depth plane alternated between equal crossed and uncrossed disparity positions. The entire display area was occupied by this plane, not merely a central square. A fixation marker with nonius lines was provided to control vergence eye movements. The evoked potential was recorded from bipolar electrodes near the inion and synchronized to each stimulus event (reversal of the disparity plane from crossed to uncrossed, or vice versa). Three types of data can be obtained from this kind of recording: response amplitude, phase of the response relative to the input stimulus, and delay of the response with respect to the stimulus. Response delay is computed from the rate of change in response phase as a function of the temporal frequency (Regan, 1972), and is therefore logically independent of the phase at a given temporal frequency. The differentiation between fine and coarse disparity mechanisms was evident in the data of Norcia, Sutter and Tyler (1985) for all three measures. The response amplitude showed a peak at about 15 min with a pronounced dip between 20 and 40 min (depending on temporal frequency), followed by a second peak as disparity was increased. The phases of the responses were different for the small and large disparities, and were separated by reliable phase differences between the two peaks (15 and 70 min). The delay computed at these peak


amplitudes was about 50 msec shorter for the coarse disparity peak than for the fine disparity peak. Thus it was concluded that the two ranges of disparity are processed by discrete, separable mechanisms which may correspond to Poggio and Fischer's (1977) binocular neural classes (TN, TO and TF) for fine disparities and (RN and RF) for coarse disparities respectively. It is also possible that the ranges of disparity processing identified in the DEP represent responses from different retinal eccentricities. The sharp discontinuity in temporal phase between the fine and coarse disparity ranges is hard to explain on the eccentricity hypothesis, however.

PSYCHOPHYSICAL ANALYSIS OF SEPARATE PROCESSING MECHANISMS IN STEREOPSIS

Psychophysical evidence for a distinction between mechanisms processing fine as opposed to coarse disparities exists in the form of different perceptual phenomena in the two disparity ranges. One of the more direct examples is in the appearance of multiple depth planes. When one views an RDS containing two overlaid disparity planes with a small disparity separation, they are fused to form the perception of a single dense plane (to be called pyknostereopsis, from the Greek for dense), whereas in the range of larger disparity separations between the two depth stimuli two transparent depth surfaces may be seen (to be called diastereopsis, from the Greek for separate or transparent). Within the pyknostereoscopic range there is depth averaging of the depths of the component planes (Fig. 4A), while the diastereoscopic range gives the three-dimensional equivalent of the lateral

Fig. 4. (A) Depiction of the pyknostereoscopic range in which two disparity planes give a percept of a single thick depth plane. (B) For larger separations between the disparity planes a diastereoscopic percept of two transparent planes is obtained.


diplopia (or even polyopia) beyond the range of binocular fusion (Fig. 4B). Schumer (1979) was the first to study depth averaging in the pyknostereoscopic range using dynamic RDS stimuli. He found that the averaging was approximately linear for a variety of stereofigures and flat planes, up to disparities of about 20 min. Beyond this value the two separate stereofigures were perceived and no averaging of the separate depth surfaces occurred. Parker and Yang (1989) reported a narrower range for averaging of only about 3 min for static random-dot stereograms. However, their estimate may have been unduly restricted by the presence of local patches containing predominantly one of the pair of disparities, since they used a noise-generation algorithm which emphasized the lower spatial frequencies. As they pointed out, the spatial extent of all the local disparity regions needed to be substantially smaller than the spatial integration area of about 4 min (Tyler, 1974; Parker & Yang, 1989), in order to present both disparities to be averaged within each spatial integration region. Since the stimuli contained patches larger than this limit, there was ample opportunity for the multiple surfaces to be perceived by means of lateral depth comparisons rather than by true diastereoscopic transparency. Other perceptual distinctions seem to be related to the same functional dichotomy. With a psychophysical technique, Mitchell and Baker (1973) found that the depth aftereffect produced by prolonged inspection of a disparate line target was optimal for a 5 min adapting stimulus, but could not be obtained for disparities greater than 15-20 min. Felton, Richards and Smith (1972), however, have shown that the optimal disparity for the production of this aftereffect depends on the size of the bars used, with larger bars producing optimal aftereffects at larger disparities.
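As a reading aid for the depth-averaging account given earlier in this passage, the following Python sketch caricatures the two percepts. It is a toy model, not Schumer's procedure: the function name `perceived_planes` is hypothetical, the hard cutoff of 20 min simply reuses the approximate limit quoted above, applying that cutoff to the separation between the two planes is an assumption, and so is the equal weighting of the two planes in the average.

```python
def perceived_planes(d1_arcmin, d2_arcmin, averaging_limit_arcmin=20.0):
    """Toy model of the percept from two overlaid disparity planes.

    Returns a list of perceived plane disparities (min arc):
    one averaged plane (pyknostereoscopic range) or the two
    separate planes (diastereoscopic range).
    """
    separation = abs(d1_arcmin - d2_arcmin)
    if separation <= averaging_limit_arcmin:
        # Assumed equal-weight linear averaging of the component depths.
        return [(d1_arcmin + d2_arcmin) / 2.0]
    # Beyond the assumed limit, both planes are seen in transparency.
    return sorted([d1_arcmin, d2_arcmin])


# Example calls with illustrative (hypothetical) disparities.
small_separation = perceived_planes(-5.0, 5.0)
large_separation = perceived_planes(-20.0, 20.0)
print(small_separation, large_separation)
```

A real observer would presumably show a graded rather than abrupt transition; the sharp threshold is used here only to keep the distinction between the two ranges explicit.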
Blakemore and Julesz (1971) reported that their best disparity aftereffect occurred rather narrowly around 10 min for a static RDS. The depth aftereffects produced by the inspection of disparate stimuli thus seem to be characteristic of fine rather than coarse disparities. Another effect that occurs only in the fine disparity region is induced movement in the disparity domain (see Tyler, 1983). A dynamic RDS contained a set of stereoscopic bars alternating between two positions in depth while the intervening regions remained fixed in


depth. At small disparities the alternating bars induced counterphase motion in their stationary neighbors. At larger disparities, induction was not seen. Here again, the different perceptual properties of the fine and coarse disparity regions support the existence of a separate fine mechanism.

Jones (1977) has shown that persons with normal stereoacuity (fine stereopsis) may have vergence eye-movement and perceptual anomalies associated with large disparities. He also suggested that the temporal properties of coarse and fine stereopsis may be different. Similarly, Langlands (1926) and Ogle and Weil (1958) have reported that fine stereopsis was improved with long exposure durations, such that maximum stereoacuity was obtained for about 3 sec exposure. Conversely, coarse disparity judgments are optimal with short exposures (Ogle, 1962). Thus, the different temporal properties for depth detection in the two disparity ranges provide further evidence that separate mechanisms are involved.

Finally, a functional dissociation between cyclopean and noncyclopean stereopsis was identified by Shimojo and Nakajima (1981). They adapted the stereoscopic system to inverted disparity information by wearing goggles that switched the images in the two eyes while going through their normal daily activities. Perceived depth from simple line stereograms was converted to the reversed direction after two days of inverted disparity experience, but random-dot stereopsis had not switched after a week’s adaptation. It therefore seems that the local and global stereoscopic systems have different degrees of susceptibility to the adaptation of disparity reversal, with global stereopsis showing no evidence of such adaptation.

The role of multiple spatial channels in disparity processing

Development of the concept of a few separate processing systems is not intended to preclude recognition of multiple processing channels within any system.
Obviously, there will be separate local channels (neurons) processing each local region of the stimulus. However, the concept of multiple channels is usually invoked for channels responding to different stimulus attributes in the same retinal region. Of particular relevance in stereopsis is the possibility of multiple channels for spatial


frequency content, postulated by Tyler (1973) and subsequently by Marr and Poggio (1976). Schor, Wood and Ogawa (1984) characterized the disparity sensitivity of spatially tuned channels by means of stimuli with Difference-of-Gaussian (DOG) luminance profiles. They found different spatial tuning properties for wide and narrow DOG stimuli. For DOG centers wider than about 0.3 deg, static disparity was processed by a set of channels with a tuning approximating that of the DOG profile. There was a size-disparity correlation for static stimuli, as predicted by Tyler (1973), such that sensitivity for larger fields fell in direct proportion to their width. (Dynamic disparity changes gave a similar picture of multiple size-tuned disparity channels, but the sensitivity reduction was more gradual with size.) In this size-scaled regime, the sensitivity could have been mediated by a set of local spatial channels with a specific interocular phase relationship; for example, a Gabor-shaped receptive field on each retina with an interocular phase relationship of 90 deg. Binocular facilitation of the response would have to be included in this model to narrow the tuning functions, since the measured tuning of the channels was narrower than that of the DOG stimuli used to measure them. For DOG stimuli whose centers were narrower than about 0.2 deg a different picture was obtained (Schor et al., 1984). The disparity appeared to be processed by a single channel that was insensitive to the stimulus width and there was no size-disparity correlation. This behavior, which at first sight seems simpler, actually requires a more complex mechanism to explain it. The channel must be able to accommodate DOG stimuli over an 8:1 ratio of sizes and disparities, and yet exclude those beyond a center size of 0.2 deg.
This behavior seems to require a mechanism which can respond to small stimuli over a large range of disparities, since Schor and Wood (1983) showed that the upper disparity limit for depth detection in this range was about 50 min, even for DOGs as narrow as 3 min center width (a 16:1 ratio). Such processing is reminiscent of a global mechanism of the type postulated by Julesz (1971), but more specific experiments designed to test this comparison with multiple stimuli and locations would be required to establish that relationship. There may be a connection between the two spatial-frequency processing regimes revealed in

the data of Schor et al. (1984) and the coarse/fine disparity regimes described in the previous section. With an upper disparity limit of about 50 min for the finer stimulus grains, the DOG data suggest a rather larger range for the fine disparity mechanism than the other studies reviewed. However, it should be noted that the DOG stimuli are sufficiently selective that there should be no competition from the coarse mechanism to limit performance. So this technique would measure the extreme limit of the fine mechanism without interference. Nevertheless, if there is a relation to the coarse/fine disparity mechanisms, the data suggest that the coarse mechanism contains a continuum of disparity channels, and emphasize that the coarse, local processing of these channels actually extends to the lower limits of the fine disparity range. Thus one could not exclude local disparity processing by restricting stimuli to small disparities. One could only eliminate fine disparity processing by selecting stimuli only with large disparities.

Response dynamics for the input to stereopsis

If stereopsis is mediated solely by the M-system, and if the M-system cells have predominantly phasic responses requiring transient stimulation for their activation, then depth perception should be optimal with transient stimulus presentation, and be impaired without it. Is there evidence to support this view? A simple demonstration that transient stimulation is important in stereopsis was reported by Julesz and Tyler (1976). It was previously known that viewing a static RDS that was uncorrelated in the two eyes produced little depth sensation beyond the indistinct perception of a lustrous surface (Tyler, 1971). However, continuous lively depth fluctuations are seen in dynamic uncorrelated noise. Every dot seems to have its own associated depth signal, which conforms to those expected from the “random spatial disparity hypothesis” (Tyler, 1977). Thus, transient stimulation present in dynamic noise enhances the stereoscopic effect. Conversely, Regan, Erkelens and Collewijn (1986) showed that slow rates of motion in depth produced no sensation of depth movement, especially for large, multidot stimuli with no stationary reference. They recorded the eye positions during the task, so they were able to exclude the possibility that vergence tracking was so effective that it eliminated the retinal disparity changes.

[Fig. 5 plot. Panel label: CWT, 15 MIN. Horizontal axis: depth reversals per sec.]

Fig. 5. Effect of temporal frequency on synchronous evoked potential amplitudes to dynamic random dot stimulation. Stimulus was a cyclopean plane alternating in disparity symmetrically around fixation. Note increase in response amplitude to a peak at about 6 depth reversals/sec (3 Hz) from virtually no response at the lowest frequencies. Response phase shift shown in lower panel. DEP is measurable up to about 30 rps (arrow shows highest coherent response frequency), close to the frequency where no depth alternation was detectable (A), but well beyond the frequency where smooth apparent depth motion disappeared (A).
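As a rough illustration of the kind of stimulus described in the caption, the following sketch generates frames of a dynamic random-dot stereogram in which the whole plane alternates between two disparities. The function names, image size, pixel disparity and frame counts are illustrative assumptions, not the parameters of the original apparatus, and shifting one eye's image with wraparound is a simplification of a plane alternating symmetrically around fixation.

```python
import numpy as np

def dynamic_rds_frames(n_frames, size=64, disparity_px=2,
                       frames_per_half_cycle=5, seed=0):
    """Yield (left, right, disparity_px) per frame (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    for k in range(n_frames):
        # Fresh random dots on every frame, so neither eye alone sees the plane.
        left = rng.integers(0, 2, size=(size, size))
        # The disparity sign flips once per half cycle: two reversals per cycle.
        sign = 1 if (k // frames_per_half_cycle) % 2 == 0 else -1
        d = sign * disparity_px
        # Shift the whole field horizontally (wraps at the edges for simplicity).
        right = np.roll(left, d, axis=1)
        yield left, right, d

def reversals_per_sec_to_hz(reversals_per_sec):
    """One full alternation cycle contains two depth reversals."""
    return reversals_per_sec / 2.0

frames = list(dynamic_rds_frames(n_frames=20))
print([d for _, _, d in frames])     # disparity sequence over the 20 frames
print(reversals_per_sec_to_hz(6))    # 3.0, matching the caption's 6 reversals/sec
```

With an assumed display rate of F frames per second, this sketch produces F / frames_per_half_cycle depth reversals per second.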

Another piece of evidence in support of the importance of transient responses in stereopsis is the temporal tuning of the DEP. In a study of evoked potential responses to disparity alternation of a dynamic random-dot stereogram depicting a flat plane, Norcia and Tyler (1984) found a pronounced response peak at 3 Hz, with almost no detectable response when the disparity alternation rate was reduced to 1 Hz (Fig. 5). Perceptually, the depth change was highly visible at these low rates, but the perceived position between changes was much more uncertain, despite the dynamic noise base. Perception thus seemed to be driven by the brief transients at each alternation, which were too sparse to generate a strong DEP. The sustained stimulation at a constant disparity between alternations was insufficient to produce either depth perception or a good DEP response.

The corollary of the M/P prediction is that depth should be difficult to see in completely static stereograms, i.e. with sustained presentation. Such a result was noted by Blakemore and Julesz (1971), in the course of a depth adaptation study. While adapting, they reported that the depth soon faded out, and deliberate eye movements (dynamic retinal stimulation) were required to restore the depth impression. No further adaptation experiments with control of the potential retinal stabilization artifacts have apparently been performed to verify this lack of depth processing from retinally static stimuli.

However, apparently contradictory evidence comes from reports of cyclopean stereopsis from binocularly stabilized images (Fender & Julesz, 1967; Piantanida, 1986). But stabilization is a relative concept, since if it were perfect the dots of the stereogram would themselves disappear. Nonetheless, both studies reported that the cyclopean depth would disappear for periods even when the monocular images were visible. Thus, the depth perception seemed to be more adaptable than the monocular texture perception. This is consistent with the view that stereopsis requires disparity change to occur, and the small vergence movements provided sufficient disparity change for stereopsis even under the conditions of apparent binocular stabilization. The precise degree of stabilization needs to be verified independently before these experiments can be definitively interpreted.

The role of position and motion sensitivity in stereopsis

Another way to look at the dynamics of stereopsis is to measure the detection threshold

[Fig. 6 plot. Panel A: zero static disparity; panel B: +5 min static disparity; panel C: -5 min static disparity. Horizontal axis: frequency (Hz).]

Fig. 6. Lateral and stereomotion thresholds as a function of temporal frequency, from Regan and Beverley (1973a-c). (A) In the zero-disparity fixation plane both monocular (A) and binocular (0) lateral motion thresholds were velocity-limited at low frequencies. Stereomotion thresholds showed position-limited behavior of frequency independence. (B) When the motion was around a point at +5 min static disparity both binocular lateral (a) and stereo depth motion thresholds (m) showed velocity-limited behavior below about 1 Hz. (C) A velocity limit similarly governed thresholds at -5 min static disparity.
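The distinction between velocity-limited and position-limited thresholds in the caption can be made concrete with a small sketch. For a sinusoidal oscillation of amplitude A at temporal frequency f, the peak velocity is 2πfA, so a detector that needs a fixed peak velocity predicts an amplitude threshold falling as 1/f, whereas a detector that needs a fixed displacement predicts a threshold independent of f. The criterion values and function names below are arbitrary assumptions for illustration and are not fitted to the data of Fig. 6.

```python
import math

def velocity_limited_threshold(freq_hz, v_crit=1.0):
    """Amplitude at which the peak velocity 2*pi*f*A reaches v_crit (assumed units)."""
    return v_crit / (2.0 * math.pi * freq_hz)

def position_limited_threshold(freq_hz, a_crit=0.1):
    """Amplitude threshold set by displacement alone, independent of frequency."""
    return a_crit

def loglog_slope(threshold_fn, f_lo=0.1, f_hi=1.0):
    """Slope of log threshold against log frequency between f_lo and f_hi."""
    rise = math.log10(threshold_fn(f_hi)) - math.log10(threshold_fn(f_lo))
    run = math.log10(f_hi) - math.log10(f_lo)
    return rise / run

for f in (0.1, 0.2, 0.4, 0.8):
    print(f, velocity_limited_threshold(f), position_limited_threshold(f))

print(loglog_slope(velocity_limited_threshold))   # close to -1: velocity limit
print(loglog_slope(position_limited_threshold))   # 0.0: frequency independence
```

On log-log axes such as those of Fig. 6, the two regimes therefore appear as a line of slope -1 and a flat line respectively.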

for sinusoidal stereomotion in depth as a function of temporal frequency. If stereopsis were totally mediated by changes of disparity, then the perception of sinusoidal stereomotion should be dominated by a velocity profile, as described for motion stimuli (Tyler & Torres, 1972; Nakayama & Tyler, 1978). An initial application of this paradigm was presented by Tyler (1971), where the sensitivity profile for oscillating depth motion in a foveal line with a stationary reference at 20 min separation showed a velocity limitation similar to that for lateral motion. Regan and Beverley (1973a) explored the disparity tuning of the low-frequency properties of depth motion of a 2 deg line relative to a random-dot fixation plane. When the line had a disparity of 5 min in front of, or behind, the plane of fixation, the depth oscillation thresholds showed a velocity profile, in accord with the Livingstone-Hubel hypothesis that stereopsis is mediated by the M-system and controlled by motion information (Fig. 6B and C). However, when the depth oscillations were in the plane of fixation, thresholds at low temporal frequencies improved dramatically (whereas those above 1 Hz were unaffected). The result was a position-sensitive profile, with oscillation thresholds independent

of temporal frequency up to 1 Hz. On the other hand, lateral motion (both monocular and binocular) retained a velocity profile, even when this motion was within the plane of fixation containing random dots which could act as an adjacent reference (Fig. 6A). Thus, for depth motion near the fixation plane, a position-sensitive stereoscopic system is required to account for the data. These results point to the presence of P-system processing of near-horopteral stereomotion (but not lateral motion). The low frequency sensitivity loss in lateral motion under conditions where sensitivity remains high for stereomotion (Fig. 6A) has interesting implications for the information flow. If monocular motion processing preceded binocular combination, the information lost prior to motion processing could not be regained for disparity processing. Therefore it must be the case that full position information is retained up to the site of binocular combination, but that the monocular motion mechanism is unable to utilize it in the absence of a reference in the other eye. When a reference (in the form of the counterphase motion) is present in the other eye, the position information is available for use by the disparity-processing mechanism. This suggests that the monocular


and binocular mechanisms operate independently on the retinal information, rather than the monocular one preceding the binocular one. (The opposite sequence is excluded by the stereomovement suppression effect (Tyler, 1971), which indicates that the binocular mechanism cannot precede the monocular one, since the binocular threshold is much higher than its monocular counterpart.)

Spatiotemporal properties of global stereopsis

A further implication of the segregation of local and global stereopsis into separate anatomical processing streams is that stimuli processed by global stereoscopic mechanisms should then exhibit the properties of a single stream. Specifically, if global stereopsis were mediated by the parvo stream, then it should not contain components attributable to the activity of the magno stream. For example the spatial response properties should be independent of the stimulus time course; operationally, the spatial-frequency tuning of stimulus detectability should be independent of the temporal envelope of the stimulus presentation (the property of spatiotemporal separability). It is well known that stimuli in the luminance domain do not exhibit such separability; Robson (1966) showed that the spatial frequency tuning of sensitivity to a grating patch varied substantially with the temporal modulation frequency. One explanation for this separability failure is the presence of more than one detection mechanism with different spatial and temporal response properties. A demonstration of separability would rule out the operation of multiple mechanisms determining threshold, as well as other explanations of nonseparability within a single mechanism. Thus, if it could be shown that cyclopean RDS stimuli exhibited spatiotemporal separability, it would be a strong argument that only one (global) channel was available to process these stimuli.

A simple test of this question is to measure the spatial frequency tuning for detection of sinusoidal disparity modulation for stimulus presentations of various durations. Short duration presentations are sufficient to maximally stimulate a transient mechanism, and should stimulate a slow-response sustained mechanism only weakly. Long duration stimuli, on the other hand, should maximally stimulate a sustained mechanism, with no further improvement in the transient mechanism response. An experiment on cyclopean detection of sinusoidal disparity gratings was therefore performed, in which the degree of stimulus transience was varied by varying the duration of presentation from uniform flat background. Based on the slow apparent response time of the cyclopean disparity mechanism described above, four durations were selected of 50, 100, 200 and 1000 msec.

The cyclopean stimuli were sinusoidal disparity modulations in dynamic random-dot displays, producing the impression of horizontal depth corrugations of a field of twinkling dots. Note that there was no coherent modulation of the luminance or contrast in these stimuli, which were generated in red/green anaglyphic form through a projection TV by means of a PDP 11/20 computer. The disparity modulation itself was produced by direct voltage control to the raster of the G-gun, and hence could be reduced to indefinitely low amplitudes without being limited by the size of the matrix elements. All modulation presentations were invisible in monocular view, by virtue of the dynamic noise field. The test area subtended 20 deg vertical by 30 deg horizontal.

The observer’s task was a forced-choice decision of the presence or absence of the depth corrugations. The corrugation disparity amplitude was varied on each trial by means of a probabilistic staircase procedure (67% up/33% down) designed to mimic the conventional 2 up/1 down staircase but to eliminate any sequential dependencies in the sequence. The first 20 trials were ignored to allow the staircase to asymptote, and threshold was defined as the average corrugation amplitude of the remaining 40 trials in each sequence.

The results are depicted in Fig. 7A in terms of cyclopean disparity modulation sensitivity, the reciprocal of the average peak-to-peak modulation amplitude at threshold. The functions show a progressive increase from the lowest cyclopean corrugation frequencies up to values in the range of 0.4-1.0 c/deg (-0.4 to 0 log c/deg). Note that this peak is an order of magnitude lower in spatial frequency than typical values for luminance modulation. Beyond the peak region, sensitivity drops steeply to a maximum measurable value of about 4 c/deg (0.6 log c/deg).

The main point of the present study is to demonstrate that the shape of the curves is constant within experimental error from 50 msec to 1 sec presentation durations. This is emphasized by the plot in Fig. 7B of the increase

[Fig. 7 plots. (A) Horizontal axis: log spatial frequency of disparity modulation (c/deg); key: 1 sec, 200 msec, 100 msec, 50 msec. (B) Horizontal axis: log duration (msec); key: 0.2, 0.4, 1, 2, 2.8 and 4 c/deg.]

Fig. 7. Cyclopean depth modulation sensitivity in dynamic visual noise at four stimulus durations. The cyclopean modulation was presented between periods of flat planes in the dynamic noise. (A) The reciprocal of log peak-to-peak amplitude threshold is plotted as a function of cyclopean spatial frequency for the presentation durations indicated in the key. (B) Cyclopean depth modulation sensitivity as a function of presentation duration for a range of spatial frequencies. Oblique line corresponds to the predicted slope for probability summation over time on the assumption of a linear detection process with time-independent additive noise.
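The slope of 0.5 predicted for the oblique line follows from linear integration of a constant signal in time-independent additive noise: the integrated signal grows in proportion to duration, while the standard deviation of the integrated noise grows only as its square root. The Monte Carlo sketch below checks this numerically. The Gaussian noise, the 10 msec time step, and the criterion that threshold is the amplitude at which the integrated signal equals the noise standard deviation are all assumptions made for illustration.

```python
import numpy as np

def sensitivity_vs_duration(n_steps_list, n_trials=20000, seed=1):
    """Sensitivity (1 / threshold amplitude) of a linear integrator in additive noise."""
    rng = np.random.default_rng(seed)
    sens = []
    for n in n_steps_list:
        # Sum n independent unit-variance noise samples on each simulated trial.
        noise_sum = rng.normal(0.0, 1.0, size=(n_trials, n)).sum(axis=1)
        noise_sd = noise_sum.std()          # grows roughly as sqrt(n)
        # A constant signal of amplitude a integrates to a * n; the assumed
        # criterion a * n == noise_sd gives the threshold amplitude.
        threshold_amplitude = noise_sd / n
        sens.append(1.0 / threshold_amplitude)
    return np.array(sens)

steps = [5, 10, 20, 100]   # 50, 100, 200 and 1000 msec at an assumed 10 msec per step
sens = sensitivity_vs_duration(steps)
slope = np.polyfit(np.log10(steps), np.log10(sens), 1)[0]
print(round(float(slope), 2))   # log-log slope of sensitivity against duration
```

Under these assumptions the fitted slope comes out close to 0.5, the value against which the curves of Fig. 7B are compared.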

in sensitivity with exposure duration. Sample curves across the range of corrugation frequencies are asymptotic to a slope of 0.5 (straight line) in a similar way at all corrugation frequencies, indicating no great deviation from a constant shape. The low corrugation frequency data (open symbols) do not show a significant departure from the form of high corrugation frequency ones (filled symbols). These data give no support to the possibility that there might be two cyclopean mechanisms with different spatiotemporal tuning properties. The slope of 0.5 may be interpreted to correspond to probability summation over time under the assumption that the detection is limited by noise which is independent of time, and that the detection process is a linear integration process over the whole stimulus (Gorea & Tyler, 1986). This integration process operates to add the same amount of signal for each increment of duration (as the stimulus is prolonged in time). The noise is similarly incremented but its amplitude increases as the square root of time because noise variances add, but the mean deviation increases with the square root of the variance. A similar result of

increased detectability with a slope of 0.5 was obtained as a function of area of cyclopean stimulation by Tyler and Julesz (1980), corresponding to probability summation with linear integration over space. These results suggest that the detection process for cyclopean events is linear (i.e. β = 1), and differs from that for contrast detection in having no threshold nonlinearity. At short durations the slopes at all cyclopean corrugation frequencies become steeper, perhaps approaching a zone of full summation corresponding to a temporal integration time of about 100 msec for the noise itself. This interpretation would imply that the processing time of the cyclopean mechanisms was about 100 msec for the stimulus conditions here, so that shorter duration stimuli and the precyclopean noise are all integrated over this duration before being available to perception.

The lack of cyclopean stereopsis for purely chromatic stimuli

All studies are in agreement that fine, cyclopean stereopsis is strongly degraded at equiluminance (e.g. Lu & Fender, 1972; de Weert,


1979). Later evidence that fine stereopsis may be supportable by purely chromatic differences (de Weert & Sadza, 1983) is suspect for two reasons. First, the size of the random elements in that study was 3.6 min. At 50% density this means that there would be a high probability of many areas 10 min or more in width present in both the center and surround regions. This is comparable in size with the “figural” or non-cyclopean stimulus used as a comparison; one would not expect a difference in performance for the two equiluminance tasks when the element sizes were comparable. Second, the stimuli were presented on video monitors, and no evaluation of the luminance edge artifacts, either from the video rise time or from ocular chromatic aberration, was presented. As recognized by the authors, it is thus possible that the detection performance with chromatic stimuli reported by de Weert and Sadza (1983) was mediated by spurious luminance contours. For both reasons, their data should not be regarded as a refutation of the previous, well-controlled demonstrations that stereopsis is abolished for purely chromatic cyclopean stimuli.

The existence of chromatic stereomovement

The concept of a sustained, chromatic P-system and a transient, stereoscopic M-system (Livingstone & Hubel, 1987) implies that not only should stereomotion be a strong percept because it amounts to a transient stimulus, but that conversely it should be much weakened with purely chromatic disparity stimuli, even in noncyclopean figural displays. Even adding the interblob/interstripe parvo system of DeYoe and Van Essen (1988) would not permit the combination of chromatic and motion information. However, since stereopsis is often much enhanced by motion cues, Tyler and Cavanagh (1989) examined the detectability of purely chromatic stereomotion in red/green sinusoidal gratings.
We found that it was not only a robust percept, but that under some conditions it was as well perceived as luminance stereomotion, and moreover exhibited different functional properties. This difference in functional properties gave a clear assurance that chromatic stereomotion processing was not just a degraded form of that for luminance stereomotion, but an independent perceptual system. We first did experiments to ensure equiluminance of oscillating red/green gratings when viewed monocularly. Oscillation thresholds were measured at the adjusted equiluminance


point and with various degrees of added luminance contrast. The chromatic gratings were generated by superimposing red and green sinusoidal modulation 180 deg out of phase. Chrominance modulation was defined as the percentage of the maximum change possible between the red and green phosphors, which is 46% of the maximum chromaticity modulation that can be theoretically obtained on the CIE color triangle. The red/green gratings used in these experiments had a chrominance modulation of 70% of our range, corresponding to 32% of average modulation of the R and G cone classes. Figure 8A shows that the threshold oscillation amplitude was greatest close to the adjusted equiluminance point, and improved symmetrically on either side. The other assurance of equiluminance is given by the fact that the depth movement properties reported in the remainder of the figure changed dramatically at the equiluminance point. Figure 8B shows the temporal frequency tunings for monocular lateral and depth movement with luminance gratings of 10% contrast. Note that both functions have dramatic elevation of oscillation phase threshold as temporal frequency was reduced below 3 Hz, and that detection of stereomotion required 2-3 times the motion amplitude (expressed in monocular amplitudes) as did the monocular presentation. Thus adding extra information in the second eye made it more difficult to see the motion-the stereomotion suppression effect first reported in Tyler (1971). The data for purely chromatic stereomotion had quite different properties (Fig. 8C). At high temporal frequencies the monocular chromatic motion thresholds were elevated by about a factor of five relative to the monocular luminance motion thresholds in Fig. 8B. On the other hand there was no difference between monocular and stereomotion thresholds in the chromatic condition (Fig. 8C), so that the stereomotion suppression seen in Fig. 8B was absent. 
In addition, there was almost no low frequency threshold elevation for chromatic stereomotion, so that stereomotion thresholds became equal at 0.5 Hz for the two types of stimulation. We thus conclude that purely chromatic motion processing lacks the type of inhibitory mechanism necessary to produce an inhibitory fall-off at low frequencies. Interestingly, for chromatic stimuli the visual system is as sensitive to depth motion as to lateral motion, which makes it relatively more sensitive

[Fig. 8 plots. Panel A horizontal axis: luminance contrast (%). Panels B and C horizontal axis: temporal frequency (Hz).]

Fig. 8. Properties of purely chromatic stereomovement perception. (A) Determination of the equiluminance points for the present stimuli for each observer. Small amounts of spatial luminance modulation were added around the equiluminance point determined by heterochromatic flicker photometry. The peak values from these data were used for the tuning experiments. Symbols distinguish monocular viewing (lateral motion) from binocular viewing (stereomotion). (B) Monocular (0) and stereoscopic (m) thresholds for a 2 c/deg luminance grating of 10% contrast as a function of temporal oscillation frequency, for two observers. Both types of threshold are plotted in terms of the monocular amplitude of oscillation (i.e. no binocular summation is presumed). Amplitudes are given in terms of degrees of phase angle of the sinusoidal cycle. (C) Monocular (0) and stereoscopic depth thresholds (I) for purely chromatic red/green gratings with the same spatial configuration as the luminance gratings in (B). Note similarity of monocular and stereoscopic thresholds, and the lack of significant threshold elevation in the low frequency range from 0.5 to 3 Hz.

to depth motion in comparison with the sensitivity ratio for luminance stimuli.

Note that this analysis is cast in terms of psychophysically-defined processes. Additional complexity is introduced when the processes are analyzed with respect to the receptive-field organization that might underlie them. For example, Ingling and Drum (1973) pointed out that cells with opponent chromatic arrangement of adjacent regions (R+/G- or R-/G+) would show a low-pass spatial frequency response to purely chromatic gratings, even though the same cells had the excitatory/inhibitory connections to exhibit a bandpass response to luminance gratings. The wiring connections are such as to pass on only noninhibitory information about purely chromatic stimuli to higher analyzers, such as the motion-processing system. In this sense, such cells lack an inhibitory organization when functioning as purely chromatic stimulus processors, although the neurophysiology reveals the operation of inhibition when the stimuli have a luminance variation.

Consequently, achromatic cells with combined cone inputs into both excitatory and inhibitory regions (R+G+/R-G-) should show a bandpass response to luminance gratings but no response at all to equiluminant chromatic gratings. Thus, it is the overall response to luminance variations that is a contaminated mixture of achromatic and chromatic cell outputs; the equiluminant stimulus not only isolates the chromatic opponent cells, but also isolates the color-selective aspect of the response of those cells. The data of Fig. 8 thus validate the concept of an independent mechanism for purely chromatic stereomovement, and its identification with the P-system. Further experiments are required to partition its location between the PB or PI components of the P-system.

Depth tilt from binocular spatial-frequency difference

A further stereoscopic capability, which needs a separate substrate from those described so far, is the perception of depth tilt from interocular spatial-frequency differences between the two eyes’ images. (The term “interocular spatial-frequency difference” is so unwieldy that it will be replaced by the term “diffrequency”, by analogy with “disparity” for positional differences between the two eyes.) If there were a specialized neural system for the detection of diffrequency tilt, it would have the advantage of allowing this feature to be processed with far simpler neural machinery than is required for conventional disparity processing. Instead of the precise receptive field and eye alignment required for conventional disparity processing, diffrequency requires only a response to the difference in size of stimuli in the same general retinal region. The size processing can operate with retinal images and receptive fields of much coarser resolution than required for disparity processing. It is therefore of interest to see whether a depth system of this type exists in the visual system, and what are its processing characteristics.

The first suggestion that diffrequency might be a specialized cue for stereopsis was made by Blakemore (1970). He used vertical sinusoidal gratings in the frontal plane as stimuli and found that binocular diffrequencies gave rise to a horizontal depth tilt up to a ratio of 1.4 between the frequencies in the two eyes. He found that the depth tilt survived a difference in drift velocities between the eyes, and suggested that it might be processed by a different neural system from that for conventional disparity.

The suggestion was taken up by Tyler and Sutter (1979), who designed several stimulus paradigms to ensure that conventional disparity cues could be excluded. The simplest of these consisted of sinusoidal grating stimuli moving in opposite directions in the two eyes. When the velocity was slow, conventional disparity cues gave the perception of a flat plane moving towards the observer.
Each time the gratings had moved the distance of a complete cycle, the plane flipped back and then continued to move towards the observer again. As the velocity was increased, a point was reached where the depth movement failed, and was replaced by a perception of rivalry between the two monocularly moving images. At this point (a velocity of 4 deg/sec), conventional disparity cues must have been ineffective, since they no longer gave rise to the disparity-appropriate perception of depth motion. (This test was not applied by Blakemore, 1970, in his original studies of the tilt effect.)
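The periodic "flip back" of the plane in the opposed-motion stimulus can be illustrated numerically. The following is a minimal sketch, not the procedure of Tyler and Sutter (1979): it assumes ideal sinusoidal gratings that start in register and drift at equal speeds in opposite directions, and it assumes that the disparity signalled is the nearest match modulo one grating cycle. The function names and the numerical values are hypothetical.

```python
import math


def interocular_offset_cycles(t_sec, speed_deg_per_sec, freq_cyc_per_deg):
    """Wrapped interocular phase offset, in cycles within [-0.5, 0.5).

    Assumption: ideal sinusoidal gratings drift at equal speeds in opposite
    directions in the two eyes and are in register at t = 0.
    """
    # Opposite drift: the offset between the eyes grows at twice the
    # monocular speed.
    relative_shift_deg = 2.0 * speed_deg_per_sec * t_sec
    offset_cycles = relative_shift_deg * freq_cyc_per_deg
    # A periodic grating is ambiguous modulo one cycle; take the nearest match
    # (an assumption of this sketch).
    return (offset_cycles + 0.5) % 1.0 - 0.5


def nearest_match_disparity_deg(t_sec, speed_deg_per_sec, freq_cyc_per_deg):
    """Nearest-match positional disparity (deg) implied by the wrapped offset."""
    cycles = interocular_offset_cycles(t_sec, speed_deg_per_sec, freq_cyc_per_deg)
    return cycles / freq_cyc_per_deg


if __name__ == "__main__":
    # Hypothetical stimulus values, chosen only for illustration.
    speed, freq = 1.0, 1.0
    for step in range(11):
        t = step * 0.05
        d = nearest_match_disparity_deg(t, speed, freq)
        print(f"t = {t:4.2f} s   disparity = {d:+.3f} deg")
```

Under these assumptions the disparity follows a sawtooth: it grows steadily, resets, and grows again once per cycle of relative displacement, which corresponds to a plane that approaches, flips back, and approaches again. The sketch says nothing about the velocity at which the percept breaks down; that limit is an empirical finding.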

If the spatial frequency in one eye was now changed by a small amount, the observers reported that the plane was simultaneously rivalrous and tilted into depth, although the depth movement from conventional disparity cues had been abolished. Thus, motion rivalry between the eyes was maintained, superimposed on the depth tilt attributable to the diffrequency between the two eyes. Depth tilts were detectable up to diffrequency ratios of 2:1, without perceptible depth movement. Considering that all three percepts are derived from narrow-band stimuli, these results clearly demonstrate that diffrequency is a separate depth cue from disparity. Previous reports of simultaneous rivalry and depth percepts (e.g. Julesz & Miller, 1975; Levinson & Blake, 1979) have involved different bands of spatial frequencies or orientations for the fused and rivalrous components of the stimulus, and thus do not imply that a single mechanism can simultaneously exhibit fusion and rivalry.

In further studies of diffrequency depth tilt, Halpern, Patterson and Blake (1987) concluded that depth tilt for sinusoidal gratings with diffrequency ratios up to 1.2:1 was encoded by a positional disparity mechanism, but accepted that a separate mechanism might account for depth tilt perception at larger ratios. However, their experimental design was flawed and the results self-contradictory. The main approach was to introduce a uniform positional disparity into the depth tilt stimulus, and to argue that tilt perception should not be degraded if it were mediated by diffrequency information alone. When the stimulus was an “edgeless” patch of grating in each eye, the results showed that tilt perception was completely independent of the positional disparity, which would be fully consistent with a diffrequency mechanism. Introduction of sharp contrast edges in the form of a small (3 deg) circular window produced significant, but incomplete, masking of the depth tilt as positional disparity increased. This was their main evidence against a diffrequency mechanism, but their own discussion attributed the loss to masking by the diplopic or rivalrous unmatched contours at the edges. Such masking could occur whatever mechanism was processing the diffrequency information, so the experiment has no bearing on the question posed in the title of their paper: “What causes stereoscopic tilt from spatial frequency disparity?”.
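For orientation, the diffrequency ratios quoted above (1.2:1, 1.4:1 and 2:1) can be related to the slant of a real surface that would produce the same interocular size ratio. The sketch below uses only the local geometry of a plane rotated about a vertical axis and viewed symmetrically by the two eyes, for which the ratio of horizontal angular magnifications is (1 + k)/(1 - k) with k = (a/2d) tan(slant), where a is the interocular separation and d the viewing distance. The viewing distance, the interocular separation and the function names are illustrative assumptions and are not taken from the studies discussed.

```python
import math


def size_ratio_from_slant(slant_deg, distance_cm, interocular_cm=6.5):
    """Interocular ratio of horizontal image size for a slanted plane.

    Assumptions: plane rotated about a vertical axis through the midline,
    symmetric viewing, local (small-patch) geometry only.
    """
    k = (interocular_cm / (2.0 * distance_cm)) * math.tan(math.radians(slant_deg))
    if abs(k) >= 1.0:
        # At |k| = 1 the plane is edge-on to one eye and the ratio diverges.
        raise ValueError("slant too steep for this viewing geometry")
    return (1.0 + k) / (1.0 - k)


def slant_from_size_ratio(ratio, distance_cm, interocular_cm=6.5):
    """Geometric slant (deg) that would produce the given interocular ratio.

    A spatial-frequency ratio is the reciprocal of the size ratio, so swapping
    which eye carries the higher frequency only reverses the sign of the slant.
    """
    if ratio <= 0.0:
        raise ValueError("ratio must be positive")
    k = (ratio - 1.0) / (ratio + 1.0)
    return math.degrees(math.atan((2.0 * distance_cm / interocular_cm) * k))


if __name__ == "__main__":
    viewing_distance_cm = 57.0  # hypothetical viewing distance
    for ratio in (1.2, 1.4, 2.0):
        slant = slant_from_size_ratio(ratio, viewing_distance_cm)
        print(f"ratio {ratio:.1f}:1  ->  geometric slant {slant:.1f} deg")
```

This gives only the slant of a physical surface consistent with a given ratio under the stated geometry; the tilt actually perceived from a grating stimulus is a separate, empirical matter.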

In terms of the processing streams, the system processing depth tilt from interocular size differences can be distinguished from the fine disparity system because the former processes up to much larger diffrequency ratios, and distinguished from the coarse system on the basis that it occurs beyond the point of collapse of stereomotion (Tyler & Sutter, 1979). If coarse stereopsis is processed by the M-system and fine stereopsis by the PI-system, then by elimination that would leave the PB system as a possible substrate for a separate pathway for diffrequency processing. This identification would make sense in terms of spatial-frequency specificity, which is low for both systems (Blakemore, 1970). It would also predict that diffrequency tilt should be unaffected by presentation in isoluminant chromatic gratings, unlike most other perceptual tasks.

CONCLUSION

This analysis suggests that binocular stereopsis, rather than being restricted to a single processing stream, is itself subdivided into an array of different processes throughout the geniculostriate pathway. Tentative identifications are proposed for coarse stereopsis and stereomotion in the M-system for motion and transient information [magno > interblob (layer 4B) stream]; fine, global stereopsis in the PI-system for high spatial frequency, static information (parvo > interblob stream); and nonclassical coarse diffrequency processing for depth tilt in the PB-system for chromatic and low spatial frequency information (parvo > blob stream) (Fig. 1).

I first suggested the view of stereopsis as consisting of multiple, parallel subsystems (Tyler, 1983) on the basis of psychophysical evidence alone. Now that the functional physiological segregation of information processing in primates is becoming better understood, it is possible to test proposed associations between the separable processes in the two domains. It is hoped that the analysis proposed here can act as a guide to such tests, in both the physiological and psychophysical studies of the stereoscopic processes.

Acknowledgements-This work was supported by NIH grants 1P30 EY 6883 and RR 5981 and SKERI grant 3.54. The cyclopean duration experiment (Fig. 7) was conducted in collaboration with Bela Julesz. My thanks to the anonymous reviewers for detailed comments on a previous draft.

REFERENCES

Blakemore, C. (1970). A new kind of stereoscopic vision. Vision Research, 10, 1181-1200.
Blakemore, C. & Julesz, B. (1971). Stereoscopic depth aftereffect produced without monocular cues. Science, 171, 286-288.
DeYoe, E. A. & Van Essen, D. C. (1988). Concurrent processing streams in monkey visual cortex. Trends in Neuroscience, 11, 219-226.
Felton, T. B., Richards, W. & Smith, R. A. (1972). Disparity processing of spatial frequencies in man. Journal of Physiology, London, 225, 349.
Fender, D. & Julesz, B. (1967). Extension of Panum’s fusional area in binocularly stabilized vision. Journal of the Optical Society of America, 57, 819-830.
Gorea, A. & Tyler, C. W. (1984). New look at Bloch’s law for contrast. Journal of the Optical Society of America, A3, 52-61.
Halpern, D. L., Patterson, R. & Blake, R. (1987). What causes stereoscopic tilt from spatial frequency disparity? Vision Research, 27, 1619-1629.
Hubel, D. H. & Livingstone, M. (1987). Segregation of form, color and stereopsis in primate area 18. Journal of Neuroscience, 7, 3378-3418.
Ingling, C. R. & Drum, B. A. (1973). Retinal receptive fields: Correlations between psychophysics and electrophysiology. Vision Research, 13, 1151-1163.
Jones, R. (1977). Anomalies of disparity detection in the human visual system. Journal of Physiology, London, 264, 621-640.
Julesz, B. (1962). Towards the automation of binocular depth perception (AUTOMAP-I). In Popplewell, C. M. (Ed.), Proceedings of IFIPS. Amsterdam: North-Holland.
Julesz, B. (1971). Foundations of cyclopean perception. Chicago: University of Chicago Press.
Julesz, B. (1978). Global stereopsis: Cooperative phenomena in stereoscopic depth perception. In Held, R., Leibowitz, H. W. & Teuber, H.-L. (Eds.), Handbook of sensory physiology, Vol. VII: Perception. Berlin: Springer.
Julesz, B. & Miller, J. (1975). Independent spatial-frequency-tuned channels in binocular fusion and rivalry. Perception, 4, 125-143.
Julesz, B. & Tyler, C. W. (1976). Neurontropy, an entropy-like measure of neural correlation in binocular fusion and rivalry. Biological Cybernetics, 22, 107-109.
Julesz, B., Kropfl, W. J. & Petrig, B. (1980). Large evoked potentials to dynamic random-dot correlograms and stereograms permit quick determination of stereopsis. Proceedings of the National Academy of Science, 77, 2348-2351.
Langlands, N. M. S. (1926). Experiments on binocular vision. Transactions of the Optical Society, 28, 230-238.
Lehmann, D., Skrandies, W. & Lindenmaier, C. (1978). Sustained cortical potentials evoked in humans by binocularly correlated, uncorrelated and disparate dynamic random-dot stimuli. Neuroscience Letters, 10, 129-134.
Levinson, E. & Blake, R. (1979). Stereopsis by harmonic analysis. Vision Research, 19, 73-78.
Livingstone, M. S. & Hubel, D. H. (1987). Psychophysical evidence for separate channels for the perception of form, color, movement and depth. Journal of Neuroscience, 7, 3416-3468.
Lu, C. & Fender, D. H. (1972). The interaction of color and luminance in stereoscopic vision. Investigative Ophthalmology, 11, 482-489.
Marr, D. & Poggio, T. (1976). Cooperative computation of stereo disparity. Science, 194, 283-287.
Marr, D. & Poggio, T. (1979). A theory of human stereopsis. Proceedings of the Royal Society, London, Series B, 204, 301-328.
Mitchell, D. E. & Baker, A. G. (1973). Stereoscopic aftereffects: Evidence for disparity specific neurons in the human visual system. Vision Research, 13, 2273-2288.
Nakayama, K. & Tyler, C. W. (1978). Relative motion between stationary lines. Vision Research, 18, 1663-1668.
Nakayama, K. & Tyler, C. W. (1981). Psychophysical isolation of movement sensitivity by removal of familiar position cues. Vision Research, 21, 427-433.
Nelson, J. I. (1975). Globality and stereoscopic fusion in binocular vision. Journal of Theoretical Biology, 49, 1-88.
Norcia, A. M. & Tyler, C. W. (1984). Temporal frequency limits for stereoscopic apparent motion processes. Vision Research, 24, 395-401.
Norcia, A. M., Sutter, E. E. & Tyler, C. W. (1985). Electrophysiological evidence for the existence of coarse and fine disparity mechanisms in human vision. Vision Research, 25, 1603-1611.
Ogle, K. N. (1950). Researches in binocular vision. Philadelphia: Saunders.
Ogle, K. N. (1962). Spatial localization through binocular vision. In Davson, H. (Ed.), The eye, Vol. 4, Visual optics and the optical space sense (Chap. 15). New York: Academic Press.
Ogle, K. N. & Weil, M. P. (1958). Stereoscopic vision and the duration of the stimulus. Archives of Ophthalmology, 59, 4-17.
Panum, P. L. (1858). Sehen mit zwei Augen. Kiel: Schwerssche Buchhandlung.
Parker, A. J. & Yang, Y. (1989). Spatial properties of disparity pooling in human stereo vision. Vision Research, 29, 1525-1538.
Piantanida, T. P. (1986). Stereo hysteresis revisited. Vision Research, 26, 431-436.
Poggio, G. & Fischer, B. (1977). Binocular interaction and depth sensitivity in striate and pre-striate cortex of behaving rhesus monkeys. Journal of Neurophysiology, 40, 1392-1407.
Poggio, G. & Talbot, W. H. (1981). Mechanisms of static and dynamic stereopsis in foveal cortex of the rhesus monkey. Journal of Physiology, London, 315, 469-492.
Poggio, G., Gonzalez, F. & Krause, F. (1988). Stereoscopic mechanisms in monkey visual cortex: Binocular correlation and disparity selectivity. Journal of Neuroscience, 8, 4531-4550.
Regan, D. (1972). Evoked potentials in psychology, sensory physiology and medicine. London: Chapman & Hall.
Regan, D. & Beverley, K. I. (1973a). Some dynamic features of depth perception. Vision Research, 13, 2369-2379.
Regan, D. & Beverley, K. I. (1973b). The dissociation of sideways movement from movements in depth: Psychophysics. Vision Research, 13, 2403-2415.
Regan, D. & Beverley, K. I. (1973c). Disparity detectors in human depth perception: Evidence for directional selectivity. Science, 181, 877-879.
Regan, D. & Spekreijse, H. (1970). Electrophysiological correlate of binocular depth perception in man. Nature, London, 255, 92-94.
Regan, D. M., Erkelens, C. J. & Collewijn, H. (1986). Necessary conditions for the perception of motion in depth. Investigative Ophthalmology and Visual Science, 27, 584-597.
Richards, W. (1972). Response functions for sine- and square-wave modulations of disparity. Journal of the Optical Society of America, 62, 907-911.
Robson, J. G. (1966). Spatial and temporal contrast sensitivity functions of the visual system. Journal of the Optical Society of America, 56, 1141-1142.
Schiller, P. H., Charles, E. R. & Logothetis, N. K. (1988). The effect of V4 and parvocellular lesions on primate vision. Investigative Ophthalmology and Visual Science (Suppl.), 29, 328.
Schiller, P. H., Logothetis, N. K. & Charles, E. R. (1990). Functions of color-opponent and broad-band channels of the visual system. Nature, London, 343, 68-70.
Schor, C. M. & Wood, I. (1983). Disparity range for local stereopsis as a function of luminance spatial frequency. Vision Research, 23, 1649-1654.
Schor, C. M., Wood, I. & Ogawa, J. (1984). Spatial tuning of static and dynamic local stereopsis. Vision Research, 24, 573-578.
Schumer, R. A. (1979). Mechanisms in human stereopsis. Thesis, Stanford University.
Shimojo, S. & Nakajima, Y. (1981). Adaptation to the reversal of binocular depth cues: Effects of wearing left-right reversing spectacles on stereoscopic depth perception. Perception, 10, 391-402.
Tyler, C. W. (1971). Stereoscopic depth movement: Two eyes less sensitive than one. Science, 174, 958-961.
Tyler, C. W. (1974). Depth perception in disparity gratings. Nature, London, 251, 140-142.
Tyler, C. W. (1975a). Characteristics of stereomovement suppression. Perception and Psychophysics, 17, 225-230.
Tyler, C. W. (1975b). Observations on binocular spatial frequency reduction in random noise. Perception, 4, 305-309.
Tyler, C. W. (1977). Stereomovement from interocular delay in dynamic visual noise: A random spatial disparity hypothesis. American Journal of Optometry, 54, 374-386.
Tyler, C. W. (1983). Sensory processing of binocular disparity. In Schor, C. M. & Ciuffreda, K. J. (Eds.), Vergence eye movements: Basic and clinical aspects. London: Butterworths.
Tyler, C. W. & Cavanagh, P. (1989). Purely chromatic stereomotion perception. Investigative Ophthalmology and Visual Science (Suppl.), 30, 324.
Tyler, C. W. & Julesz, B. (1980). On the depth of the cyclopean retina. Experimental Brain Research, 40, 196-202.
Tyler, C. W. & Sutter, E. E. (1979). Depth from spatial frequency difference: An old kind of stereopsis? Vision Research, 19, 859-865.
Tyler, C. W. & Torres, J. (1972). Frequency response characteristics for sinusoidal movement in the fovea and periphery. Perception and Psychophysics, 12(B), 232-236.
Weert, C. M. M. de (1979). Colour contours and stereopsis. Vision Research, 19, 555-564.
Weert, C. M. M. de & Sadza, K. J. (1983). New data concerning the contribution of colour differences to stereopsis. In Mollon, J. D. & Sharpe, L. T. (Eds.), Colour vision: Physiology and psychophysics (pp. 553-562). London: Academic Press.
