Vision Res. Vol. 30, No. 2, pp. 303-316, 1990. Printed in Great Britain. All rights reserved
Copyright © 1990 Pergamon Press plc
DISCRIMINATION FOR BAND-PASS RANDOM DOT KINEMATOGRAMS
ROBERT CLEARY* and OLIVER J. BRADDICK
Department of Experimental Psychology, University of Cambridge, Downing Street, Cambridge CB2 3EB, England

(Received 11 November 1988; received for publication 19 June 1989)

Abstract-When an array of random dots is displaced, the ability to report the direction of apparent motion is subject to an upper spatial limit (d_max). As the size of the displacement is increased, direction discrimination errors show a monotonic increase that becomes asymptotic at a chance level. We have measured direction discrimination using spatially band-pass filtered random dots. These stimuli do not yield a monotonic increase in errors. Rather, for displacements greater than around 1 cycle of the stimulus centre frequency (F_c), performance oscillates about chance, with displacements of 1¼ cycles of F_c yielding systematic errors in perceived direction. We analyse this pattern of performance in terms of the stimulus autocorrelation function and conclude that d_max can be taken as lying on the initial rising portion of the displacement versus error function. Using this definition we find, in line with the results of Chang and Julesz (1985), that d_max scales inversely with F_c. Contrary to the results of Chang and Julesz, we find that this scaling holds beyond 4 c/deg.

Random dot kinematogram
*Present address: Department of Psychology, University College London, Gower Street, London WC1E 6BT, England.

INTRODUCTION

In its simplest form, the “random dot kinematogram” consists of a pair of static random dot images presented in quick succession. The second image is the same as the first, except that the dots have been uniformly displaced in a single direction. Various studies (Anstis, 1970; Julesz, 1971; Bell & Lappin, 1973) have demonstrated that a stimulus of this kind can elicit the perception of a smooth, coherent motion. Braddick (1974) showed that the coherence of the apparent motion was critically dependent on the size of the spatial displacement of the dots between the first and second frames. Below an upper limit, termed “d_max”, the rigid motion of the dots was reliably perceived. However, for displacements greater than d_max, the stimulus appears as a mass of incoherent local movements. This dependence on displacement has been interpreted in terms of the “correspondence problem,” familiar from recent theories of stereopsis (e.g. Marr & Poggio, 1979; Mayhew & Frisby, 1981). In this context, the fact that we perceive coherent movement in a sequence of static frames is seen to imply an ability to establish associations between corresponding elements across successive images. The breakdown of rigid motion for large displacements is seen as a breakdown of the correspondence process, the sensation of incoherent motion being the result of random, inappropriate correspondences. This phenomenon allows a simple measure of d_max to be obtained by increasing the stimulus displacement until the direction of the displacement is no longer discriminable.

With a manipulation of the density of dots in a kinematogram, Baker and Braddick (1982) showed that d_max was a spatial limit, rather than a limit imposed by the number of potential matches available to the correspondence process. They conclude that d_max “is more consistently expressed as the retinal angle of displacement, than as the number of pixels (traversed)” (p. 1258), a result that may be taken as tentative support for the notion that d_max reflects receptive field properties of motion detecting neurons.

This idea has received some further backing from the fact that d_max is not a constant angular limit. For example, Chang and Julesz (1983, 1985) have demonstrated the spatial frequency dependence of d_max. In their 1985 paper they used spatially filtered random dots, with a bandwidth of 1 octave, to measure d_max for
Fig. 1. (a) A simplified schematic representation of a sub-unit of the elaborated Reichardt motion detector. Signals from two band-pass receptive fields, separated by a distance (d), are multiplied together (×), one passing through a low-pass temporal filter (L). The organisation illustrated gives a selectivity for rightward motion. (b) A single zero crossing from an image. Marr and Ullman’s (1981) spatio-temporal gradient detector calculates the sign of the temporal derivative at the zero crossing. If the polarity of the zero crossing is known, then the sign of the temporal derivative defines the direction of all instantaneous displacements less than the distance (d).
direction discrimination. They showed d_max to be inversely proportional to centre frequency for kinematograms with centre frequencies in the range 1-4 c/deg.

These data may be usefully related to a recent electrophysiological investigation of direction selective cortical units. Baker and Cynader (1986) correlated the displacement tuning and spatial frequency tuning of directionally selective units in area 17 of the cat. Displacement tuning was assessed with bar stimuli which were flashed sequentially within the receptive fields, while the spatial frequency tuning of the cells was determined from line weighting functions. Baker and Cynader show that for each cell there was an optimal displacement for directionally selective responses to apparent motion, and there was an inverse scaling of this displacement with the preferred frequency of the cell. While Baker and Cynader measured optimum displacement, rather than a maximum, they were able to estimate an upper limit from their data and found that it scaled well with the optimum. The majority of their cells yielded optimum displacements of around d of a cycle, while they estimate the corresponding upper spatial limit to be between two and three times the optimum.

The inverse scaling of d_max with spatial frequency is predicted by several current models of the motion sensor. Three of these models (Adelson & Bergen, 1985; van Santen & Sperling, 1985; Watson & Ahumada, 1985) described a directionally selective sensor (forming one half of an opponent pair) that is fed by two bandpass receptive fields separated by some distance
d (Fig. 1a). By delaying the output of one field before comparing it with the other, a directionally selective response can be obtained. To preclude the spatial aliasing that would occur if the detector were sensitive to image components with a spatial period less than 2d, the models propose that the input filters be separated by a distance equivalent to ¼ of a cycle of the filter’s preferred frequency (van Santen & Sperling, 1984). This inverse scaling of the detector’s displacement tuning with its spatial frequency tuning implies, for a single component of the preferred frequency, a peak response to instantaneous displacements of ¼ of a cycle. For that same spatial component, the directional response of the detector necessarily falls to zero for displacements of ½ a cycle, given that the stimulus has no direction. For displacements greater than ½ a cycle (but less than 1 cycle), an inverse response will be obtained, and the reverse direction signalled. We can extend this argument to predict a “d_max” for a set of such detectors when they are stimulated with an image that has a band-pass characteristic, centred on the preferred frequency of the detector. Under these circumstances one would expect that, on average, the net directional response of the detectors would fall to zero for a displacement of ½ a cycle of the image’s centre frequency.

An alternative model of the motion sensor comes in the form of Marr and Ullman’s (1981) spatio-temporal gradient detector. It detects the direction of motion by calculating the sign of the image’s temporal derivative at points corresponding to edges of known polarity (Fig. 1b). Marr and Ullman suggest that this process occurs within each of a set of band-pass spatial frequency channels. Within this model d_max could be identified with the distance d marked on Fig. 1b. For an instantaneous displacement of this size, or greater, the relationship between the sign of the temporal derivative and the direction of motion fails to be reliable. The spatial frequency dependence of d_max is also potentially well accounted for in that d is proportional to the space constant of the bandpass filter providing the spatial derivative, and assuming that the filter pass-band is symmetrical, approximates to ½ a cycle of its preferred frequency. Again, one can take this value to represent the upper spatial limit for a correct directional response, taken on average across a set of such detectors, in the case of a band-pass image centred on the detectors’ preferred frequency.

Fig. 2. (a) A filtered random dot pattern with a bandwidth of 1 octave and an F_c of 12 cycles per image width. (b) A similar image, with the same F_c, filtered to a bandwidth of 0.5 octaves.

Thus, current models of the motion sensor, as well as Baker and Cynader’s (1986) physiological data, predict the inverse scaling of d_max with spatial frequency found by Chang and Julesz (1985). However, this consensus should be qualified by noting some further aspects of Chang and Julesz’s results. Firstly, Chang and Julesz report (for centre frequencies in the range 1-4 c/deg) that d_max approximates 1 cycle of the centre frequency of the stimulus. Superficially at least, this seems to be inconsistent with the ½ cycle limit implied by the models of the motion sensor noted above, and the similar physiological limit described by Baker and Cynader (1986). Secondly, Chang and Julesz report that for centre frequencies above 4 c/deg, the scaling of d_max with spatial frequency breaks down completely.

In considering the apparent discrepancy between the predicted and obtained value for the upper spatial limit, it should be noted that Chang and Julesz used stimuli of suprathreshold contrast.
Under these circumstances we cannot assume that the centre frequency of a band-pass stimulus corresponds to the preferred frequency of the mechanisms mediating the discrimination of its direction. Additionally, other psychophysical measures of d_max have produced results consistent with the ½ cycle prediction: Turano and Pantle (1985) asked subjects to report the duration of motion after-effect, subsequent to stimulation with a sinusoidal grating repeatedly undergoing displacement through a given phase angle. Their
results show that after-effects could be generated with displacements approaching ½ a cycle, with peak durations being obtained for shifts somewhat below ¼ of a cycle, for spatial frequencies of 0.5, 1.0 and 4.0 c/deg. Frequencies greater than 4 c/deg were not tested. Turano and Pantle suggest that their results can be interpreted in terms of the displacement tuning of spatial frequency tuned motion sensors. However, there is a crucial limitation to the use of sinusoidal gratings for the measurement of the upper spatial limit. As the authors note, the strongest conclusion that may be drawn from their results is that the maximum displacement to which a channel is sensitive is not less than ½ a cycle. This is because, for a periodic stimulus such as a sinusoid, certain displacements are non-directional. For example, if a directionally selective sensor was maximally sensitive to displacements of ½ a cycle of its preferred frequency, this sensitivity would not be evidenced with a measure such as direction discrimination or after-effect duration, because, for a displacement of this size, a sinusoid provides no directional stimulus.

This restriction (which also applies to Nakayama & Silverman’s, 1985, measurement of contrast thresholds for direction discrimination of sine gratings in apparent motion) highlights the value of using 2D narrow-band stimuli rather than fully periodic gratings. These can provide directional stimuli, localised in spatial frequency, with displacements greater than ½ a cycle. However, in this paper we present data showing that the measurement of d_max in narrow-band images is also problematic. Having analysed the nature of these difficulties, we go on to extend Chang and Julesz’s (1985) measurements and discuss the discrepancies we find between the two data sets.
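The half-cycle ambiguity of a sinusoid discussed above can be illustrated with a short sketch. It assumes an idealised one-dimensional sinusoid sampled at a few points; the function names `grating` and `nearest_equivalent_shift` are illustrative and do not come from the paper. Because a sinusoid shifted by a given number of cycles is identical to one shifted by that amount minus any whole number of cycles, the smallest shift consistent with the second frame changes sign once the displacement exceeds ½ a cycle.

```python
import math

def grating(shift_cycles, samples=16):
    """One period of a unit sinusoid, displaced rightward by shift_cycles."""
    return [math.sin(2 * math.pi * (k / samples - shift_cycles))
            for k in range(samples)]

def nearest_equivalent_shift(shift_cycles):
    """Smallest signed shift (cycles, in [-0.5, 0.5)) that yields the same frame."""
    return (shift_cycles + 0.5) % 1.0 - 0.5

for shift in (0.25, 0.5, 0.75):
    alias = nearest_equivalent_shift(shift)
    # The displaced grating and its aliased version should be indistinguishable.
    same = all(abs(a - b) < 1e-9
               for a, b in zip(grating(shift), grating(alias)))
    print(shift, alias, same)
```

In this idealisation a ¾ cycle shift is indistinguishable from a ¼ cycle shift in the opposite direction, while a ½ cycle shift is equidistant from both directions and so carries no directional information.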
METHODS

The images used in the following experiments were generated by using Fourier techniques to apply binary (or “hatbox”) isotropic filters to two-dimensional arrays of random grey levels (white noise). The range of grey level values in these noise images was set so that the output of the filters fell within the 8-bit range of the display system used. The filters are simply defined in terms of their high and low cutoff frequencies (F_h and F_l). Their centre frequency (F_c) is defined:

$$F_c = \frac{F_h + F_l}{2}.$$
Their bandwidth in octaves (B) is given by:

$$B = \log_2 \left( \frac{F_h}{F_l} \right).$$
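As a worked illustration of the filter definitions, the sketch below recovers the cutoff frequencies from a centre frequency and a bandwidth. It assumes the arithmetic-mean definition of the centre frequency given above and the standard definition of bandwidth in octaves as the base-2 logarithm of the ratio of high to low cutoff; the function names are hypothetical and not taken from the paper.

```python
import math

def band_edges(f_c, bandwidth_octaves):
    """Return (f_low, f_high) for a given centre frequency and bandwidth."""
    ratio = 2.0 ** bandwidth_octaves   # f_high / f_low
    f_low = 2.0 * f_c / (1.0 + ratio)  # from f_c = (f_high + f_low) / 2
    f_high = ratio * f_low
    return f_low, f_high

def centre_and_bandwidth(f_low, f_high):
    """Inverse of band_edges: centre frequency and bandwidth in octaves."""
    return (f_high + f_low) / 2.0, math.log2(f_high / f_low)

# Example: the 0.5 octave stimulus centred on 2.66 c/deg.
low, high = band_edges(2.66, 0.5)
print(round(low, 3), round(high, 3))
print(centre_and_bandwidth(low, high))
```

For the 0.5 octave, 2.66 c/deg stimulus this gives cutoffs of roughly 2.20 and 3.12 c/deg.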
Unless otherwise stated, the peak-to-trough Michelson contrast of the filtered images was 50%. Some example images are shown in Fig. 2. Each kinematogram consisted of a pair of images presented in succession on a high resolution raster-scan c.r.t. (Visual Contact VMH 3 10). Each 128 by 128 pixel image was sampled from a larger (256 by 256 pixel) array, generated in advance of a block of trials and held in an 8-bit digital frame-store (Computer Design and Applications MDP-3B). C.r.t. gamma nonlinearity was measured with a photometer, and was corrected by means of an output lookup table held in the frame store. The display seen by the observer can be thought of as a 128 by 128 window onto the larger stored array. On a given trial the window was initially centred over
a randomly determined point in the stored array, and this 128 by 128 sample formed the first frame of the kinematogram. A given stimulus displacement was achieved by displacing the window by the required number of pixels, either up or down. (If the window crossed the edge of the stored array, it “wrapped around” to the opposite side.) The observers’ task was to report the direction of apparent motion, by pressing one of two buttons. No feedback was given. Each of the two frames of the kinematogram was displayed for 66.6 msec (i.e. four 60 Hz video scans), with no inter-frame interval. The stimulus was both preceded and followed by a blank field of the stimulus mean luminance (35 cd/m²). Immediately prior to the first frame, a central fixation mark was superimposed on the pre-stimulus field for 1 sec. The display was viewed at a distance of 228 cm, and subtended an angle of 44 deg arc.

In a block of trials, observers were presented with a set of kinematograms with a constant F_c and bandwidth, stimulus displacement being varied. Each displacement tested in a block was presented 20 times within a random ordering. At least two blocks were completed by each subject for each condition and thus all data represent at least 40 observations. All three subjects (R.C., I.H. and a naive subject J.N.) were at least moderately practised psychophysical observers and had normal, or corrected to normal, vision.
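The windowing procedure described above can be sketched as follows. This is a minimal illustration rather than the authors' display code: the function name `kinematogram_frames` is hypothetical, the window is positioned by its top-left corner rather than its centre, and a small integer array stands in for the 256 by 256 filtered noise image.

```python
import random

def kinematogram_frames(stored, window, shift_rows, rng=random):
    """Sample two frames from a stored square array, wrapping at the edges.

    The second frame is the same window moved vertically by shift_rows
    pixels (positive or negative), as in a two-frame kinematogram.
    """
    size = len(stored)
    top = rng.randrange(size)   # random starting row of the window
    left = rng.randrange(size)  # random starting column of the window

    def sample(first_row):
        return [[stored[(first_row + r) % size][(left + c) % size]
                 for c in range(window)]
                for r in range(window)]

    return sample(top), sample(top + shift_rows)

# Toy example: an 8 by 8 stored array, a 4 by 4 window, a 2 pixel shift.
stored = [[row * 8 + col for col in range(8)] for row in range(8)]
first, second = kinematogram_frames(stored, 4, 2, random.Random(1))
print(first)
print(second)
```

The modulo indexing implements the wrap-around to the opposite side of the stored array.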
PSYCHOMETRIC FUNCTIONS FOR NARROW BAND KINEMATOGRAMS ARE NON-MONOTONIC
Fig. 3. (a) Displacement versus error function for an unfiltered random dot pattern. Observer: R.C. Each point represents 40 observations. The horizontal dashed lines mark ±2 standard deviations from the performance expected by chance on the basis of binomially distributed responses. d_min is taken to be the displacement yielding 20% errors on the initial decline in error rate (i.e. around 5 min arc, calculated by linear interpolation), while d_max is taken as the displacement yielding 20% errors on the rising portion of the graph (i.e. around 25 min arc). (b) Equivalent data for random dot stimuli filtered with F_c = 2.66 c/deg and bandwidth 0.5 octaves. Horizontal axis: displacement (min arc).
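The two criteria used in the caption, a chance band of two standard deviations for binomially distributed responses and a 20% error point found by linear interpolation, can be sketched as below. The function names and the example error rates are hypothetical and chosen only for illustration; they are not data from the paper.

```python
import math

def chance_band(n_trials, n_sd=2.0):
    """Error-rate limits n_sd standard deviations either side of 50% chance."""
    sd = math.sqrt(0.5 * 0.5 / n_trials)  # binomial SD of a proportion at p = 0.5
    return 0.5 - n_sd * sd, 0.5 + n_sd * sd

def criterion_crossing(displacements, error_rates, criterion=0.2, rising=True):
    """First displacement at which the error rate crosses the criterion,
    estimated by linear interpolation between neighbouring data points."""
    points = list(zip(displacements, error_rates))
    for (x0, e0), (x1, e1) in zip(points, points[1:]):
        crosses = (e0 < criterion <= e1) if rising else (e0 > criterion >= e1)
        if crosses:
            return x0 + (criterion - e0) * (x1 - x0) / (e1 - e0)
    return None

# Hypothetical error rates (not the paper's data), displacement in min arc.
displacements = [2, 4, 8, 16, 24, 32, 40]
errors = [0.40, 0.25, 0.05, 0.05, 0.15, 0.40, 0.50]
print(chance_band(40))
print(criterion_crossing(displacements, errors, rising=False))  # lower limit
print(criterion_crossing(displacements, errors, rising=True))   # upper limit
```

With 40 observations per point the band spans roughly 34% to 66% errors, so only error rates outside that range differ reliably from chance under this criterion.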
To put the results obtained using spatially filtered kinematograms into context, Fig. 3a presents some direction discrimination data for an ordinary unfiltered kinematogram. For small displacements a lower spatial threshold, termed “d_min”, is apparent. At large displacements (around 20 min arc) the results are characterised by a monotonic increase in error rate, reaching an asymptote of around 50% (chance performance), for displacements greater than 40 min arc. This decline from near-perfect to chance performance is taken to reflect the breakdown of coherent apparent motion. This pattern of performance is consistent with that obtained in previous studies (e.g. Lappin & Bell, 1976; Baker & Braddick, 1982, 1985a,b) and in line with Baker and Braddick, we use linear interpolation to estimate d_max as the displacement yielding 20% errors.

A very different pattern of performance is obtained for a narrow-band filtered kinematogram. Figure 3b presents results for a kinematogram filtered so as to have an F_c of 2.66 c/deg and a bandwidth of 0.5 octaves. As can be seen, the monotonic function has been replaced by a semi-periodic one. For displacements above 20 min arc, the error first rises to a peak that is considerably above the chance level, before falling again to below 20% and subsequently returning to an approximately chance level for displacements approaching 50 min arc. (Informal observations of still larger displacements suggested that if there were further oscillations in the error rate, they were considerably smaller than the first peak and trough, with little consistency between subjects.) Perhaps the most striking thing about this pattern of results is the presence of error rates above 50%; i.e. for a range of displacements around 30 min arc, the observer tends to perceive motion in the direction opposite to the true displacement.

These results raise two questions: firstly, why do we obtain this non-monotonic pattern? Secondly, given that there is more than one distinct range of displacements for which direction discrimination errors are below 20%, is it meaningful to assign a value to d_max?

Figure 4 presents further results for 0.5 octave kinematograms, showing that the pattern of results in Fig. 3b generalises across observers, and a range of different centre frequencies (1.33 c/deg, 2.66 c/deg, 5.33 c/deg). In this figure, the displacement scale has been expressed in cycles of the stimulus centre frequency, rather than retinal angle. This convention illustrates the way in which the direction discrimination performance scales closely with F_c.

Discussion

The error functions we have obtained are notable in two respects: firstly, they show a fairly precise scaling across spatial frequency;
Fig. 4. (a) Direction discrimination results for observer R.C., for three values of F_c (▽: 1.33 c/deg; ×: 2.66 c/deg; ○: 5.33 c/deg). The displacement axis is expressed in terms of cycles of F_c. The close registration of the 3 functions demonstrates the inverse scaling of direction discrimination performance with spatial frequency. (b) and (c) Equivalent results for observers I.H. and J.N. respectively. (d) The mean function for the three subjects and three values of F_c presented in (a), (b) and (c).
secondly, they have a quasi-periodic form. Taken together, these properties suggest that the results may have been influenced by the quasi-periodic nature of the narrow-band images. If a truly periodic image is displaced instantaneously, neither the direction, nor the extent of the displacement is uniquely specified by the stimulus. Under such circumstances, errors of direction discrimination are inevitable for certain displacements (this phenomenon may be familiar as the “wagon wheel” illusion of cine films). Whilst it is true that any instantaneous displacement in a narrow-band image does uniquely specify a particular direction and extent, the existence of an upper limit on the correspondence process means that systematic errors in perceived direction could occur: for displacements greater than the upper limit, the true correspondence between features in the first and second frames could not be detected. Under such circumstances, the quasi-periodic structure of the images might result in systematic mismatches being made between similar features in the two frames. Such false correspondences could form the basis of a coherent motion percept, different in extent and/or direction to that specified by the true displacement.

With this point in mind, it is reasonable to ask whether the error functions we have obtained with quasi-periodic images are merely an approximation of the kind of error function one would obtain with a purely periodic image. An examination of the mean data presented in Fig. 4d shows that this is not the case. In the case of a truly periodic image one would expect perceptual reversals for displacements of ¾ of a cycle and 1¾ cycles. However, Fig. 4d shows that these displacements yield better than chance performance. Rather, it is displacements around
1¼ cycles of the centre frequency that give a reversed motion percept, again opposing the prediction for a periodic image.

In making these comparisons we are assuming that F_c is an appropriate measure of the quasi-periodicity of the narrow-band stimuli. While this is a useful shorthand, a more complete measure of periodicity can be gained by considering the stimulus autocorrelation function (ACF). The ACF for a two-dimensional image is itself a two-dimensional function. However, in the case of isotropic noise it is circularly symmetrical. The one-dimensional ACF for displacements along the y axis, for a discretely sampled image I(x, y) of size n by n is given by:

$$ACF(j) = \sum_{x=1}^{n} \sum_{y=1}^{n} I(x, y)\, I(x, y + j), \qquad [j = 1, 2, 3, \ldots];$$

where, if y + j > n,

$$I(x, y + j) = I(x, y + j - n).$$
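A direct implementation of this wrapped (circular) autocorrelation might look like the sketch below. The subtraction of the mean and the normalisation to a value of 1 at zero offset are assumptions made here so that the result reads as a correlation; the function name `circular_acf` is illustrative only.

```python
import math

def circular_acf(image, j):
    """Normalised autocorrelation of a square image for a vertical offset j,
    with rows wrapping around at the edge of the array."""
    n = len(image)
    mean = sum(sum(row) for row in image) / (n * n)
    numerator = 0.0
    denominator = 0.0
    for y in range(n):
        for x in range(n):
            a = image[y][x] - mean
            b = image[(y + j) % n][x] - mean  # wrap-around in y
            numerator += a * b
            denominator += a * a
    return numerator / denominator

# Toy example: a vertical sinusoid with a period of 8 rows in a 16 by 16 image.
n = 16
image = [[math.sin(2 * math.pi * y / 8) for _ in range(n)] for y in range(n)]
for offset in (0, 2, 4, 8):
    print(offset, round(circular_acf(image, offset), 6))
```

For this purely periodic test image the correlation is +1 at offsets of whole cycles and -1 at half-cycle offsets; a narrow-band noise image would instead show side peaks that decay with offset.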
The ACF is given alternatively by the Fourier transform of the image power spectrum. The latter method was used to calculate the ACFs presented here. The ACF for the 0.5 octave stimuli is shown in Fig. 5. This function can be thought of as the cross-correlation of the first and second frames of a kinematogram, with the offset between the frames along the horizontal, and the correlation between the frames along the vertical. For example, displacements of around 1 cycle of the image centre frequency result in a correlation of around 20% between the frames.

Fig. 5. The discrete autocorrelation function (ACF) for a filtered random dot pattern with a bandwidth of 0.5 octaves. Horizontal axis: offset (cycles of F_c).

We have suggested that where a quasi-periodic image is displaced through a distance greater than the upper spatial limit of the motion sensor, it is likely that alternatives to the true matches will be available. The ACF of our narrow-band images allows us to estimate the level of correlation, a measure of the quality of any potential match. We take a correspondence or match to be represented by a peak in the ACF of the stimulus image. The true correspondence is represented by the single point of 100% correlation; however, the other peaks represent displacements for which the features (e.g. peaks, troughs, zero-crossings) of the first frame of the stimulus tend to coincide with similar features in the second. In a situation where more than one potential match falls within the spatial range of the motion sensors stimulated by the images, it seems reasonable to suppose that the “better”
overall match (the one yielding a higher correlation) would dominate the perceived direction.

Above, we noted that if performance is a non-monotonic function of displacement, there may be a problem in assigning a single value to d_max. In Fig. 4d it seems plausible to locate d_max on the initial rise in errors, around 1 cycle of F_c, but can we justify this assertion when direction discrimination is also maintained for displacements approaching 2 cycles of F_c? Using the ACF as a measure of the various potential correspondences within a kinematogram we can attempt to validate our conjecture regarding the locus of d_max: if the displacement we take as the upper spatial limit is correct, we should be able to use it to account for direction discrimination performance for displacements beyond that limit, because the upper limit determines which of the potential matches are available to the motion detectors.

We start by assuming that the initial rise in error rate does indeed represent an upper spatial limit operating on the motion detectors stimulated by the 0.5 octave kinematograms. (In addition, we take the raised error rates for the smallest displacements tested to represent a lower spatial limit.) Measurement of the relevant 20% error points on the mean curve in Fig. 4d yields values of 0.2 cycles of F_c and 1.0 cycles of F_c, for d_min and d_max respectively. Figure 6a shows how these values can be used to account for the fact that a displacement of around 1¾ cycles of F_c gives rise to better than chance direction discrimination. (The displacement illustrated is in fact 1.87 cycles of F_c, the local minimum of the mean error function shown in Fig. 4d.) On this figure the rightward stimulus displacement is represented by the lateral offset of the ACF relative to the origin. The spatial limits we have proposed for detectors sensitive to rightward motion (as well as those sensitive to leftward motion) are marked on the horizontal axis.
The range of displacements to which each population is sensitive is shown cross-hatched. As can be seen from this figure, the true correspondence between kinematogram frames (represented by the peak of 100% correlation) is unavailable to the detectors, as it is outside their range. However, two other peaks, representing alternative matches between the first and second frames, fall within the spatial limits of the motion sensors. The larger peak, i.e. the better match, implies a relatively small displacement in the correct direction (right). We assume that this match, yielding twice the
Fig. 6. (a) The figure uses the autocorrelation function to illustrate the potential correspondences yielded by a 0.5 octave r.d.k. with an instantaneous rightward displacement of 1.87 cycles of F_c. The displacement is represented by the rightward shift of the ACF relative to the origin. The proposed spatial limits of the motion sensors tuned to the stimulus are marked on the horizontal axis. These limits are taken from the mean direction discrimination data presented in Fig. 4d. Thus, the crosshatched region to the right of the origin represents the spatial range of the population of sensors selective for the rightward motion of the narrow-band stimulus. Similarly, the crosshatched area to the left of the origin marks the spatial range of the sensors selective for leftward motion. The true correspondence between the frames of the kinematogram is unavailable to the motion sensors, as the displacement has taken the peak of full correlation outside their spatial range. However, a second peak, representing an alternative partial correspondence between the frames, falls well within the range of the rightward detectors. On this basis we would expect the observer to be able to respond to the direction of this displacement at a better than chance rate. (b) Illustrates a rightward displacement of 1.22 cycles of F_c. For this displacement, a single low quality match falls within the range of the leftward sensors. No match falls within the range of the rightward sensors. From this, we would expect direction discrimination to be worse than chance.
correlation of the alternative, will dominate perception and thus direction discrimination will be better than chance. Figure 6b illustrates a displacement of around 1¼ cycles of F_c using the conventions employed in Fig. 6a. (Again the actual displacement illustrated is the value associated with the local extreme, in this case the maximum at 1.22 cycles
of F_c.) This stimulus appears as a movement in the direction opposite to the true displacement. Figure 6b shows that the true correspondence between the stimulus frames is unavailable to the motion process. In fact, for this rightward displacement, the only peak of correlation to fall within the range bounded by d_min and d_max (at -0.84 cycles) implies a leftward motion.

Thus, our estimates of d_min and d_max are able to provide a reasonable qualitative account of the non-monotonic nature of the error functions we have obtained. We therefore conclude that although narrow-band kinematograms yield non-monotonic error functions, an upper spatial limit may be properly measured by locating d_max on the initial rising portion of the displacement versus error function.

However, it should be noted that the estimates of the spatial limits we have used in our illustrations are based on the somewhat arbitrary definition of d_min and d_max in terms of 20% error rates. It seems likely that these “limits” represent points on a smooth curve of displacement sensitivity, and it would be unwise to assume a complete insensitivity to potential correspondences that fall just outside the spatial range bounded by d_min and d_max. In practice one could expect multiple matches, representing displacements of different directions and extents, to be available to the motion process. This suggestion is consistent with the introspections of the observers who reported that large displacements sometimes appeared as a pair of semi-coherent, opposing, movements. Occasionally, subjects reported that these movements appeared to occur in succession, as if the image had “bounced”. For displacements above 1 cycle of F_c the mean error rate does not diverge from chance by more than around 25%, a fact which may well reflect these ambiguities.

It should be added that in this simple analysis we have only been concerned with the positions of the peaks in the ACF (the “matches”).
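The peak-matching account can be sketched numerically under some strong simplifying assumptions, none of which are taken from the paper's own calculations: the image is treated as continuous white noise passed through an ideal annular filter (so the ACF has a closed form in the Bessel function J1), offsets are in cycles of F_c, only ACF peaks with positive correlation count as matches, and the best match falling between d_min = 0.2 and d_max = 1.0 cycles determines the perceived direction. All function names are hypothetical.

```python
import math

def bessel_j1(x, steps=400):
    """J1(x) from its integral form, (1/pi) * integral_0^pi cos(t - x sin t) dt."""
    total = sum(math.cos(k * math.pi / steps - x * math.sin(k * math.pi / steps))
                for k in range(1, steps))
    return total / steps  # trapezoid rule; the two endpoint terms cancel

def ideal_acf(r, f_low, f_high):
    """ACF at offset r for white noise passed through an ideal annular filter."""
    if r == 0:
        return 1.0
    numerator = (f_high * bessel_j1(2 * math.pi * f_high * r)
                 - f_low * bessel_j1(2 * math.pi * f_low * r))
    return numerator / (math.pi * r * (f_high ** 2 - f_low ** 2))

def acf_peaks(f_low, f_high, max_offset=3.0, step=0.01):
    """(offset, height) of the central peak and of positive local maxima."""
    n = int(round(max_offset / step))
    values = [ideal_acf(k * step, f_low, f_high) for k in range(n + 1)]
    peaks = [(0.0, 1.0)]
    for k in range(1, n):
        if values[k] > values[k - 1] and values[k] > values[k + 1] and values[k] > 0:
            peaks.append((k * step, values[k]))
    return peaks

def predicted_direction(displacement, peaks, d_min=0.2, d_max=1.0):
    """+1 (true direction), -1 (reversed) or 0 (no match within range)."""
    best_position, best_height = None, 0.0
    for offset, height in peaks:
        # The ACF is symmetric, so each peak appears either side of the true match.
        for position in (displacement - offset, displacement + offset):
            if d_min <= abs(position) <= d_max and height > best_height:
                best_position, best_height = position, height
    if best_position is None:
        return 0
    return 1 if best_position > 0 else -1

# 0.5 octave band with F_c = 1, so that offsets are in cycles of F_c.
ratio = 2.0 ** 0.5
f_low = 2.0 / (1.0 + ratio)
f_high = ratio * f_low
peaks = acf_peaks(f_low, f_high)
for displacement in (0.5, 1.22, 1.87):
    print(displacement, predicted_direction(displacement, peaks))
```

Under these assumptions the sketch should reproduce the qualitative pattern described in the text: the true direction for a displacement within the limits, a reversed direction near 1.22 cycles, and the true direction again near 1.87 cycles. It says nothing about the size of the error rates, which would require a model of sensitivity within the displacement range.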
It would be of interest to expand on the quantitative detail of this kind of model, and experiment with some variants. For example, a mechanism including some form of summation within the displacement range of the detectors would be a plausible alternative to the scheme presented here. It would also be of interest to extend the analysis to include variations on the images we have used here. It is possible that the correspondence between the autocorrelation function and the data is a special case that would not hold in the case of, for example, filtered images of sparse dot patterns.

d_max SCALES WITH F_c BEYOND 4 C/DEG

The results for 0.5 octave bandwidth patterns showed a close scaling with spatial period, for centre frequencies of 1.33-5.33 c/deg (Fig. 4). That is to say, d_max* is inversely proportional to F_c (cf. the 15 min arc limit of Braddick, 1974). Chang and Julesz (1985), using 1.0 octave bandwidth, reported a similar scaling, but one which broke down completely for spatial frequencies above 4 c/deg. They attribute their constant d_max at higher frequencies to the action of a second, cooperative, process.

*While we do not have a complete data set, it looks as if d_min would scale in a similar manner.

However, there are some questions about this interpretation. Firstly, Chang and Julesz covaried viewing distance and stimulus area (in a manner that is somewhat unclear), to bring about changes in spatial frequency. Cleary and Braddick (1990) demonstrate an area dependence of the effect of spatial blur on d_max. It is possible that stimulus area is also an important variable in the analysis of Chang and Julesz’s scaling data. However, Chang and Julesz do present a nearly complete set of data for a single viewing distance; a set which shows the breakdown in scaling. If this data set represents a single stimulus area (again this is not entirely clear from their description) then our suggestion is plainly invalid.

A second concern relates to Chang and Julesz’s use of a standard staircase procedure to measure d_max. In view of the non-monotonic direction discrimination performance found for narrow-band stimuli, the use of a staircase technique may be inappropriate. We have shown that in the case of high frequency narrow-band stimuli, an increase in displacement of only a few minutes of arc can lead to a dramatic improvement in direction discrimination performance. Chang and Julesz used a fixed size (i.e. one pixel) displacement increment for two consecutive correct responses, regardless of F_c.
Under such circumstances it is possible that the use of a staircase procedure would over-estimate dmax for high, but not low, spatial frequencies. We must emphasise, however, that this is merely speculation. Without details of the size of the unit step at the retina, it is not possible to substantiate any criticism of the technique employed.
With the above points in mind, we have measured dmax for a set of 1.0 octave band-pass kinematograms with 5 different centre frequencies (0.66, 1.33, 2.66, 5.33 and 10.66 c/deg). The broadband white noise from which these stimuli are filtered has a flat power spectrum; it does not contain equal power in equal octaves. As a result the contrasts of the band-pass stimuli vary with centre frequency by a factor of 10, with the Michelson contrast of the lowest frequency stimulus falling to approx. 5%. (These rough values were calculated on the basis of peak and trough values read from a sample of the filtered images used in the experiments.) Chang and Julesz give no details of the contrasts of their stimuli, and so two conditions were employed in the present experiment. In the first, presented to all observers, the various stimulus contrasts were not adjusted. In the second, viewed by R.C. only, the contrasts of all the stimuli were normalised to 50%. That is to say, the range of grey levels present in each of the filtered images was normalised linearly over a fixed range of grey levels, the extrema of which represented a Michelson contrast of 50%. It should be noted that this procedure does not change the relative power of the Fourier components of the stimulus.
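The linear contrast normalisation described above can be illustrated with a minimal sketch. It assumes the standard Michelson definition, (max - min) / (max + min), and a linear mapping of the image extrema onto a fixed range. The function names, the `mean_level` parameter and the sample luminance values are hypothetical and are not taken from the experiment.

```python
def michelson_contrast(values):
    """Michelson contrast, (max - min) / (max + min), of a luminance sample."""
    hi, lo = max(values), min(values)
    return (hi - lo) / (hi + lo)


def normalise_contrast(values, target=0.5, mean_level=1.0):
    """Linearly rescale values so that their extrema give the target contrast.

    The minimum maps to mean_level * (1 - target) and the maximum to
    mean_level * (1 + target). mean_level is an arbitrary, assumed unit.
    """
    hi, lo = max(values), min(values)
    if hi == lo:
        raise ValueError("sample has no contrast to rescale")
    new_lo = mean_level * (1.0 - target)
    new_hi = mean_level * (1.0 + target)
    gain = (new_hi - new_lo) / (hi - lo)
    return [new_lo + gain * (v - lo) for v in values]


# Hypothetical low-contrast luminance sample (about 5% Michelson contrast).
sample = [0.95, 1.02, 1.05, 0.98, 1.00]
rescaled = normalise_contrast(sample, target=0.5)
```

Because the mapping is linear, every difference between grey levels is multiplied by the same gain, which is why the relative power of the non-zero Fourier components is left unchanged.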
Figure 7 presents displacement versus error functions for these 1.0 octave stimuli. In comparing these functions with the results obtained for the 0.5 octave stimuli, it is clear that the oscillations in error rate for displacements above dmax are less marked for the larger bandwidth. In particular, there is only patchy evidence of the reversed motion phenomenon clearly demonstrated with 0.5 octave patterns. This is understandable given that the ACF for the 1.0 octave patterns is of the same form as that for the 0.5 octave patterns, but is more severely damped. It seems that the quasi-correspondence that supports reversed motion in the 0.5 octave stimuli provides too poor a correlation to do so reliably in the case of the 1.0 octave images. However, it is clear that even with the larger bandwidth, the error rate for large displacements does not simply flatten out at 50%; displacement increments beyond dmax can lead to error rates substantially and significantly below chance. As described above, dmax was taken to be the 20% point on the initial rising portion of the error function. Figure 8 plots the values for dmax against Fc for three observers (the fourth set of points on this figure relates to the contrast equalised stimuli observed by R.C.). It is clear that over the whole range of Fc measured, dmax increases steadily with decreasing Fc, at a rate that falls slightly short of the exact inverse scaling represented by a slope of -1. The dashed line plotted on this figure represents an inverse scaling of dmax and Fc, where dmax equals 1 cycle of Fc. Even at the lowest spatial frequency tested, where the measured values of dmax centre around 5 of a cycle rather than 1 cycle, none of the observers falls to a level of performance that would imply a half-cycle spatial limit. In comparing the results for the two contrast conditions for observer R.C., it is apparent that the normalisation to constant contrast has little overall effect, although in the case of the lowest Fc tested (which prior to normalisation has the lowest contrast, around 5%) the increase to 50% contrast seems to yield a small increase in dmax.
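As a rough illustration of the 20% criterion and of the one-cycle prediction, the sketch below interpolates linearly between measured points to find the first upward crossing of the criterion, and converts a centre frequency into the size of one cycle in minutes of arc. The use of linear interpolation is an assumption, since the text does not state how the 20% point was read off. The function names and the error data are hypothetical and are not values from Fig. 7.

```python
def dmax_from_errors(displacements, error_rates, criterion=0.20):
    """Displacement at which the error rate first rises through the criterion.

    Points are scanned from the smallest displacement upwards, so only the
    initial rising portion of a non-monotonic function is used. Returns None
    if no upward crossing is found.
    """
    points = list(zip(displacements, error_rates))
    for (d0, e0), (d1, e1) in zip(points, points[1:]):
        if e0 < criterion <= e1:
            # Linear interpolation between the two bracketing points.
            return d0 + (criterion - e0) * (d1 - d0) / (e1 - e0)
    return None


def one_cycle_arcmin(fc_cycles_per_deg):
    """Size of one cycle of the centre frequency, in minutes of arc."""
    return 60.0 / fc_cycles_per_deg


# Hypothetical non-monotonic error function (displacements in min arc).
disp = [2, 4, 6, 8, 10, 12, 14, 16]
errs = [0.00, 0.02, 0.05, 0.15, 0.35, 0.60, 0.45, 0.30]
estimate = dmax_from_errors(disp, errs)
prediction = one_cycle_arcmin(5.33)
```

Taking the first crossing, rather than any crossing, is what keeps later dips in the error rate from being mistaken for the upper limit.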
In comparing these results with those of Chang and Julesz (1985), we note that we find no evidence of a breakdown in scaling above 4 cycles/deg. While it is fair to say that we have only tested one value of Fc that is substantially above 4 c/deg, we would argue that since all three observers yield values close to the scaling prediction (under two conditions for R.C.) there is a clear discrepancy between the results of the two studies.
GENERAL DISCUSSION
Taken together, the results of these experiments indicate that the upper spatial limit for narrow-band apparent motion is close to 1 cycle of Fc, over a fairly wide range of spatial frequencies. At a qualitative level, the approximate inverse scaling of dmax with spatial frequency is consistent with both current models of the motion sensor, as well as the electrophysiological data of Baker and Cynader (1986). Can we, however, reconcile the obtained 1 cycle spatial limit with the quantitative predictions of the theoretical and physiological studies? As noted in the introduction, both correlation and gradient models of band-pass motion sensors imply an upper spatial limit equivalent to half a cycle of the detector's preferred frequency. Baker and Cynader's study of striate cells suggests a somewhat lower limit. In assessing the apparent discrepancy between these values and the spatial limit we have obtained, it must be noted that Fc need not reflect the preferred
Fig. 7. Example displacement vs error functions for 1.0 octave band-pass stimuli with 5 values of Fc (0: 0.66 c/deg; v: 1.33 c/deg; x: 2.66 c/deg; 0: 5.33 c/deg; l: 10.66 c/deg). (a) Observer R.C.; (b) Observer J.N.
frequency of the mechanisms mediating direction discrimination. We employed stimuli well above threshold contrast, which are likely to stimulate a wide range of spatial frequency tuned channels. While the stimuli themselves are restricted in spatial frequency, it is unlikely that the channels they stimulate are equally restricted. For example, Anderson and Burr's (1985) masking study suggests that over the range of frequencies we have tested, the half-height bandwidth of movement sensitive channels would probably be between one and two octaves. In addition, it seems that motion sensitive mechanisms saturate at rather low contrasts. Nakayama and Silverman (1985) measured minimum displacement thresholds for sinusoidal gratings as a function of contrast, and report the effects of saturation for contrasts as low as 2%. Keck, Palella and Pantle (1976) found that the strength of the motion aftereffect initially rises linearly with the contrast of the adapting stimulus, but then saturates at a contrast of around 5%. Given these results, detectors with a preferred frequency well away from Fc may have contributed to performance in our experiment. If the direction discrimination performance was mediated by sensors tuned to frequencies 1 octave below Fc, dmax could be expressed as half a cycle of the detectors' preferred frequency. In order to isolate the sensors tuned to Fc in
particular, direction discrimination could be measured at contrast threshold, or one could turn to adaptation/masking techniques.
An alternative explanation exists of the seemingly high value we have obtained for dmax. Adelson and Movshon (1982), in their analysis of the “aperture problem”, have presented compelling evidence that motion is initially encoded by a range of orientation selective sensors, each of which extracts a single oriented component of an object’s two-dimensional motion vector. The greater the angle between a particular component and the true direction of motion, the smaller the velocity or displacement of that component will be, reducing to zero at 90 deg to the direction of motion. Thus, in a 2D kinematogram, a motion sensor selective for orientations oblique to the true direction of motion will be stimulated by a displacement that is smaller than the displacement in the true direction. Given this analysis, it is possible that even though individual motion sensors might be restricted by an upper spatial limit of half a cycle, direction discrimination for displacements larger than this value could be mediated by sensors tuned to directions away from the true direction of motion. However, Cleary and Braddick (1987) have examined this issue by comparing values of dmax obtained with one- and two-dimensional band-pass images. They find that similar levels of performance are obtainable with the two types of stimuli. This result demonstrates that the present data cannot be explained in terms of the stimulation of off-axis motion sensors.
Fig. 8. dmax for 1.0 octave stimuli, plotted as a function of Fc (cycles/deg). The dashed line represents an inverse scaling of dmax with Fc, such that dmax equals one cycle of Fc. (0: R.C.; x: R.C. normalised contrast; 0: I.H.; 0: J.N.)
The spatial limit found in electrophysiological studies depends on the cortical area studied. Mikami, Newsome and Wurtz (1986) measured the limits of displacement tuning for both striate and extra-striate units in the macaque. They found that cells in extra-striate area MT were generally tuned to larger displacements than was the case in area V1. They also measured the way in which this displacement tuning changed as a function of eccentricity. They found that the data for MT were well fitted by the psychophysical measures of Baker and Braddick (1985a), while the V1 results were not. While it is always difficult to draw firm conclusions from such comparisons, they suggest that sensitivity to large displacements or velocities is probably mediated by the MT cells. Although displacement limits in MT have not been studied as a function of spatial frequency, it is possible that attempts to relate the psychophysical results we have obtained to the striate unit data of Baker and Cynader (1986) are inappropriate. There is certainly a substantial discrepancy between the data sets, and not only in terms of dmax. Baker and Cynader report that the majority of their cells exhibited a tuning to displacements around d of a cycle. Our results indicate that direction discrimination errors have already risen to 20% for displacements of $ a cycle. It would obviously be of great interest to determine the displacement tuning of directionally selective MT cells, in terms of their spatial frequency tuning.
CONCLUSIONS
We have used narrow-band random dot kinematograms to establish that an approximate inverse scaling holds between dmax and spatial frequency. This result provides support for the idea that a range of motion sensors exist, possessing similar spatial organisation but varying in scale (and hence spatial frequency tuning).
Because narrow band stimuli are quasi-periodic, apparent reversals of direction occur, and must be taken into account in determining dmax. When we do this, we find no evidence of the breakdown of scaling at high frequencies reported by Chang and Julesz (1985); instead we find an inverse scaling over the whole of the frequency range from 0.6 to 10 c/deg.
We have demonstrated that direction discrimination may be maintained beyond half a cycle of the image's centre frequency, a result that would be unobtainable with the more conventional use
of sinusoidal stimuli. However, to determine the true dmax for a single channel, measurements need to be made near contrast threshold.
Acknowledgement-This work was supported by an M.R.C. post-graduate studentship, awarded to R.C., and an M.R.C. project grant, awarded to O.J.B.
REFERENCES
Adelson, E. H. & Bergen, J. R. (1985). Spatio-temporal energy models for the perception of motion. Journal of the Optical Society of America, A2, 284-299.
Adelson, E. H. & Movshon, J. A. (1982). Phenomenal coherence of moving visual patterns. Nature, London, 300, 523-525.
Anderson, S. J. & Burr, D. C. (1985). Spatial and temporal selectivity of the human motion detection system. Vision Research, 25, 1147-1154.
Anstis, S. M. (1970). Phi movement as a subtraction process. Vision Research, 10, 1411-1430.
Baker, C. L. & Braddick, O. J. (1982). The basis of area and dot number effects in random dot motion perception. Vision Research, 22, 1253-1260.
Baker, C. L. & Braddick, O. J. (1985a). Eccentricity-dependent scaling of the limits for short-range apparent motion perception. Vision Research, 25, 803-812.
Baker, C. L. & Braddick, O. J. (1985b). Temporal properties of the short-range process in apparent motion. Perception, 14, 803-812.
Baker, C. L. & Cynader, M. S. (1986). Spatial receptive field properties of direction selective neurons in cat striate cortex. Journal of Neurophysiology, 55, 1136-1152.
Bell, H. H. & Lappin, J. S. (1973). Sufficient conditions for the discrimination of motion. Perception and Psychophysics, 14, 45-50.
Braddick, O. J. (1974). A short range process in apparent motion. Vision Research, 14, 519-527.
Chang, J. J. & Julesz, B. (1983). Displacement limits for spatial frequency filtered random-dot cinematograms in apparent motion. Vision Research, 23, 1379-1385.
Chang, J. J. & Julesz, B. (1985). Cooperative and non-cooperative processes of apparent movement of random-dot cinematograms. Spatial Vision, 1, 39-45.
Cleary, R. & Braddick, O. J. (1987). Apparent motion in one- and two-dimensional band-pass images. Perception, 16, A38.
Cleary, R. & Braddick, O. J. (1990). Masking of low frequency information in short-range apparent motion. Vision Research, 30, 317-327.
Julesz, B. (1971). Foundations of cyclopean perception. Chicago: University of Chicago Press.
Keck, M., Palella, T. D. & Pantle, A. (1976). Motion aftereffect as a function of the contrast of sinusoidal gratings. Vision Research, 16, 187-195.
Lappin, J. S. & Bell, H. H. (1976). The detection of coherence in moving random dot patterns. Vision Research, 16, 161-168.
Marr, D. & Poggio, T. (1979). A computational theory of human stereo vision. Proceedings of the Royal Society, London, B204, 301-328.
Marr, D. & Ullman, S. (1981). Directional selectivity and its use in early visual processing. Proceedings of the Royal Society, London, B211, 151-180.
Mayhew, J. E. W. & Frisby, J. P. (1981). Psychophysical and computational studies towards a theory of human stereopsis. Artificial Intelligence, 17, 349-385.
Mikami, A., Newsome, W. T. & Wurtz, R. H. (1986). Motion selectivity in macaque visual cortex: Spatiotemporal range of directional interactions in MT and V1. Journal of Physiology, London, 55, 1328-1339.
Nakayama, K. & Silverman, G. H. (1985). Detection and discrimination of sinusoidal grating displacements. Journal of the Optical Society of America, A2, 267-274.
van Santen, J. P. H. & Sperling, G. (1984). A temporal covariance model of motion perception. Journal of the Optical Society of America, A1, 451-473.
van Santen, J. P. H. & Sperling, G. (1985). Elaborated Reichardt detectors. Journal of the Optical Society of America, A2, 300-321.
Turano, K. & Pantle, A. (1985). Discontinuity limits for the generation of visual motion aftereffects with sine and square wave gratings. Journal of the Optical Society of America, A2, 260-266.
Watson, A. B. & Ahumada, A. J. (1985). Model of human visual motion sensing. Journal of the Optical Society of America, A2, 322-341.