Probabilistic Computations for Attention, Eye Movements, and Search

Miguel P. Eckstein

Department of Psychological and Brain Sciences, University of California, Santa Barbara, California 93106-9660; email: [email protected]

Annu. Rev. Vis. Sci. 2017. 3:319–42

First published as a Review in Advance on July 26, 2017

The Annual Review of Vision Science is online at vision.annualreviews.org

https://doi.org/10.1146/annurev-vision-102016-061220

Copyright © 2017 by Annual Reviews. All rights reserved

Keywords

attention, search, scene context, ideal Bayesian observer, eye movements

Abstract


The term visual attention immediately evokes the idea of limited resources, serial processing, or a zoom metaphor. But evidence has slowly accumulated that computations that take into account probabilistic relationships among visual forms and the target contribute to optimizing decisions in biological and artificial organisms, even without considering these limited-capacity processes in covert attention or even foveation. The benefits from such computations can be formalized within the framework of an ideal Bayesian observer and can be related to the classic theory of sensory cue combination in vision science and context-driven approaches to object detection in computer vision. The framework can account for a large range of behavioral findings across distinct experimental paradigms, including visual search, cueing, and scene context. I argue that these forms of probabilistic computations might be fundamental to optimizing decisions in many species and review human experiments trying to identify scene properties that serve as cues to guide eye movements and facilitate search. I conclude by discussing contributions of attention beyond probabilistic computations but argue that the framework’s merit is to unify many basic paradigms to study attention under a single theory.

1. THE PROBLEM


The alarm rings. You walk toward the medicine cabinet in the bathroom. Find the toothbrush. Find the toothpaste. Find the comb. Find the deodorant. In the kitchen, find the coffee. Find the milk in the refrigerator. Find the keys to the car. Find the wallet. Find the elevator. Find the button to operate the elevator. Find the key that opens the office. Find the keyhole. Find the power button on the desktop. Find the icon for the email. Even in the post-hunter-gatherer era, a day in the life of a human still involves a long sequence of brief visual searches. And in the fast pace of modern society, the failure of one of these routine searches can delay our day and be a source of frustration.

Each search often involves moving the eyes to point the central area of the human retina (the fovea) at regions of the scene to acquire the visual information needed to find the next target. The fovea processes visual information with high spatial detail, and objects surrounding the search target do not hinder (or crowd) its detection. Visual processing away from the fovea (the visual periphery) is mediated by a diminished density of cone photoreceptors, a higher degree of convergence of cones' outputs onto retinal ganglion cells, and fewer neurons in the primary visual cortex per millimeter of retina. As a result, fine spatial discriminations are not possible with peripheral processing, and away from the fovea, the detrimental effects of crowding are high (Rosenholtz 2016, Strasburger et al. 2011). The ability to detect and discriminate objects in the visual periphery is highly deteriorated (Findlay & Gilchrist 2003, Geisler & Chou 1995, Levi 2008, Rovamo et al. 1984). Thus, humans must often make eye movements to utilize the fovea to explore a scene and find the searched target.

But why have many animals that rely on vision as the main sensory input evolved this varying-resolution (foveated) visual system? What are the costs of a visual system that supports foveal high spatial detail across the entire visual field? The most common answer involves the high metabolic cost of implementing homogeneous processing with high spatial detail over the entire visual field. Over a fourth of the human brain is already dedicated to vision, and visual processing with foveal spatial detail across the entire visual field would increase that substantially. The density of cones in the fovea is approximately 20 times larger than at 10° into the periphery and 90 times larger than at the far visual periphery (Curcio et al. 1990). The fovea, which occupies 0.01% of the retina, utilizes approximately 10% of the neuronal machinery in the primary visual cortex (Azzopardi & Cowey 1993). A high-resolution processing system across the entire visual field (matching the fovea's ratio of primary visual cortex neurons per millimeter of retina) would result in approximately a 1,000-fold increase in the size of the primary visual cortex. Instead, many animals have evolved a varying-resolution visual system in which a central area is given preferential processing and representation in the brain. A foveated visual system with eye movements might achieve search accuracy comparable to that of a visual system with full spatial detail across the entire visual field but with large computational savings (Akbas & Eckstein 2014). A successful foveated visual system relies critically on the guidance of eye movements to efficiently explore and extract information from scenes and complete a search within a short time frame.
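The 1,000-fold figure follows directly from the two percentages quoted above; here is a back-of-envelope check (a sketch, with the quoted values hard-coded as assumptions):

```python
# Back-of-envelope check of the cortical scaling argument in the text.
# Quoted figures (assumptions): the fovea occupies ~0.01% of the retina
# but is served by ~10% of primary visual cortex (V1).
foveal_v1_share = 0.10        # fraction of V1 devoted to the fovea
foveal_retina_share = 0.0001  # fraction of retinal area that is foveal

# Neurons-per-unit-retina in the fovea relative to the retina-wide average;
# granting the whole retina this foveal ratio would scale V1 by ~the same factor.
magnification = foveal_v1_share / foveal_retina_share
print(f"approximate increase in V1 size: {magnification:,.0f}-fold")  # ~1,000-fold
```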
A foveated system with random eye movements suffers from the limitations of peripheral processing unless it has unlimited time to exhaustively fixate the entire scene (Akbas & Eckstein 2014). Instead, the brain of humans and many animals uses peripheral processing to extract critical information to guide the eyes across the scene. Peripheral information is also utilized to influence our decisions when eye movements cannot foveate a region or when the visual event takes place while the observer is fixating elsewhere. In this context, we refer to covert attention as the human ability to utilize prior knowledge about the visual environment to selectively process or integrate visual information across the visual field [but for a broader or different view, see Carrasco (2011) and Tatler et al. (2011)].

The prior knowledge is either acquired through communication from another organism or through learning. For a foveated visual system, one of the main purposes of covert attention is to optimize how peripheral information guides eye movements. The other important function of covert attention is to improve the accuracy of perceptual decisions. Assume that you have to make a quick judgment but do not have time to look at all spatial locations: Covert attention can select and integrate visual information across the visual periphery to improve decision accuracy.

What properties of the visual environment can be selected to guide eye movements and influence our perceptual decisions? There is consensus that information about the searched target, including basic features such as size, orientation, shape, and color, is utilized by the brain to guide eye movements and influence our perceptual decisions during search (Bravo & Farid 2009, Eckstein et al. 2007, Findlay 1997, Malcolm & Henderson 2009). Accordingly, most models of human search include a representation of the features corresponding to the searched target (Duncan & Humphreys 1989, Eckstein et al. 2001, Kanan et al. 2009, Najemnik & Geisler 2005, Navalpakkam & Itti 2005, Wolfe 2007, Zelinsky 2008, Zhang & Eckstein 2010). Models based solely on bottom-up saliency (i.e., the visibility of different regions in an image irrespective of the behavioral goals) often fall short of predicting eye-movement control during search (Hayhoe & Ballard 2014, Henderson et al. 2009, Koehler et al. 2014, Tatler et al. 2011). Similarly, computer vision models might rely on generic object detectors to identify potential targets, but the majority of state-of-the-art algorithms utilize target features to detect specific objects (Ren et al. 2016, Zeiler & Fergus 2014).

Does relying solely on target features ensure successful search? For a foveated visual system, it does not. Oftentimes, a target can be small, crowded by other objects, and difficult to detect in the visual periphery. For such scenarios, the human brain must utilize other properties of the visual environment to guide eye movements and facilitate decisions. Objects in the visual environment are typically not randomly located: Fruits tend to be on or under trees, cars on the road, chimneys on houses, and airplanes in the sky. Humans have a remarkable ability to learn statistical relationships among objects from a young age (Bulf et al. 2011, Fiser & Aslin 2002), and such regularities can be exploited to facilitate a variety of perceptual decisions. How crucial are these statistical relationships between objects or the scene's low-level properties and likely target locations? Just retrace our original search, but imagine that the keyhole on an office door is at ankle level, the elevator's buttons are at the back wall of the elevator, and the desktop's power button is on top rather than at the front or back of the computer. Your morning would not get off to a good start.

The focus of the current review is on how organisms can take advantage of the statistical relationships between the visual environment and the target and utilize probabilistic computations that combine different sources of information in the visual periphery to influence their perceptual decisions and eye movements. We rely on the well-known Bayesian probabilistic framework (Geisler 2011, Knill & Pouget 2004).
The current focus is less on whether the brain actually computes a probability¹ [see Ma (2012) for such discussion] but rather on how computations that take into account probabilistic relationships among objects or visual forms in the visual environment can give rise to performance benefits in biological and artificial organisms, even without postulating additional limitations on the mechanisms mediating covert attention (Bundesen 1990, Posner et al. 1980, Treisman & Gelade 1980).

¹It is well known that decision strategies can take advantage of statistical relationships in the environment without explicitly computing probabilities (Green & Swets 1989). We note that other investigators have used the term probabilistic computations to refer to the nature of the neural computation and whether it is probabilistic in nature (Ma 2012, Ma et al. 2015).


The remainder of this review is organized as follows. Section 2 presents the general probabilistic framework and its principles as applied to the study of visual attention and search. It also discusses its connections to the well-established theory of the subfield of sensory cue combination. Section 3 is an overview of the experimental results that can be accounted for by the framework. Section 4 presents a sample data set for a spatial cueing task for humans, monkeys, and bees to illustrate that the ability to use computations that take into account probabilistic relationships in the environment is not uniquely human but likely pervasive across many animals. Section 5 describes studies trying to assess which aspects of scenes are utilized to guide eye movements during search and discusses the evidence for probabilistic computations in this domain. Section 6 discusses possible neural implementations of the computations. The final section discusses possible future paths of research as well as instances that cannot be accounted for solely by probabilistic computations.

2. PROBABILISTIC COMPUTATIONS AID PERCEPTUAL DECISIONS AND EYE MOVEMENTS

Distinct experimental paradigms have been used to study visual attention in the laboratory. Visual search and the cueing paradigm are arguably the most prominent. In the search paradigm (Figure 1a), a target is embedded among distractors. The experimenter typically manipulates the number of elements (or distractors) in the display and measures the impact on performance measures, such as reaction times or search accuracy. The change in slope of the response time versus set size or the degradation in accuracy with an increasing number of elements (set size) is used to make inferences about the attentional mechanisms and the function of attention (Duncan & Humphreys 1989, Wolfe 1998b).

In the cueing paradigm (Figure 1b), a target appears at one of M spatial locations (typically two or four), and a highly visible cue (e.g., a box or arrow) indicates the likely target location. The cue is assumed to be an experimental manipulation of covert visual attention (Posner et al. 1980). Performance facilitation on validly cued trials (those in which the cue indicates the target location) relative to invalidly cued trials is used to interpret how attention changes visual processing.

Another prominent paradigm is the contextual cueing paradigm (Figure 1c). It consists of a search task in which a specific configuration of target and distractors is repeated across trials and alternated with novel configurations. Search times to find the target are typically shorter for the repeated configuration relative to novel configurations. The result is interpreted as a learned background guiding attention toward the target. In real scenes, finding a search object is often facilitated when the target is presented at a spatial location where it often appears relative to the background and other objects in the scene. In contrast, the search time is lengthened, or the target is even missed, when it is located at an unexpected location (Figure 1d). Eye movements in many of these paradigms also tend to be directed more often toward the target when it appears at the cued location or at the spatial position associated with higher probability in synthetic displays or scenes (Brockmole & Henderson 2006, Droll et al. 2009, Liston & Stone 2008, Peterson & Kramer 2001, Walthew & Gilchrist 2006).

Performance benefits or costs in each of these paradigms have been classically associated with distinct limitations or attributes of covert attention: (a) an inability of visual attention to make fine discriminations in parallel across the entire scene and a temporally serial requirement to bind distinct visual properties of an object or make fine discriminations (Treisman & Gelade 1980, Wolfe 1998a); (b) a zoom metaphor in which attention enhances the visual processing at an attended location (Posner et al. 1980); and (c) the limited resources of attention that, when deployed at a likely target location, enhance the processing of the target.


Figure 1 Common experimental paradigms to study visual attention. Studies typically measure response times to find the target (tilted red line) or present the display for a limited time and measure accuracy. (a) Visual search paradigm, (b) cueing paradigm, (c) contextual cueing paradigm, and (d) scene context paradigm. When studying covert attention, to eliminate low-level confounds, a central fixation is used (a,b) and elements can be placed at equal retinal eccentricity.

Yet, in the majority of these experimental manipulations, the stimulus that leads to greater performance provides additional information (or cues) to the observer about the likely location of the target. When the hard-to-detect or hard-to-discriminate target appears with a highly visible cue, the cue provides additional information about the likely target location. When the number of distractors in a search array is physically reduced, there is reduced uncertainty about the likely target location. Using boxes or arrows to indicate the subset of likely target positions within a larger search array also provides additional information about the target. Repeating a distractor–target configuration likewise provides a consistent relationship about the location of the target, given the specific distractors. Such additional information could potentially be utilized to improve the accuracy of observer decisions and eye movements.

To understand the contributions of additional target location information to task performance, one must first consider that the sensory evidence utilized by the brain to reach decisions is variable and subject to noise. The variability arises from uncertainty in the viewing vantage point of objects, the illumination, and the characteristics and configuration of objects in the visual environment. For example, a target might be partially occluded by another object, and a distractor might appear at a vantage point that makes it confusable with the target.

In addition, noise arises from the cellular machinery that processes information, including fluctuations in the opening and closing of ion channels that influence action potentials (Faisal et al. 2008, Knill & Pouget 2004, Shadlen & Newsome 1998). These sources of variability can lead distractors to be associated with higher sensory evidence than that associated with the actual target and result in a decision error. In this context, an organism that utilizes any additional information about the likely location of the target to integrate or evaluate neural signals can reduce the unwanted contributions of decoy sensory evidence arising from distractor locations. Such a strategy will lead to fewer decision errors by the organism.

So what would be a principled approach to assess the expected performance benefits from the additional visual information? A common approach has been to consider an ideal observer that utilizes all information provided in the environment to make optimal decisions (Barlow 1980, Geisler 2011). The model is often linked to Helmholtz's classic concept of reverse inference: The observer makes optimal inferences about the state of the world from the sensory data. The model considers a set of hypotheses about the state of the world (e.g., target presence or identity of the stimulus). It aims to estimate the probability of each of the hypotheses by considering the likelihood of obtaining the observed sensory responses given that the ith hypothesis (Hi) is true and the prior probability (prevalence) of each hypothesis. When making a decision, the ideal observer simply chooses the hypothesis associated with the highest probability. The ideal observer serves as a benchmark against which to compare human performance and assess whether performance improvements or detriments can be captured by these variations in the information in the visual environment (Geisler 2011). The requirements for stipulating the model are a well-defined perceptual task, knowledge of the collection of possible stimuli, and the statistical properties of the sensory representations.

This framework has been commonly applied to a variety of problems in perceptual psychology, including the subfield of cue combination (Geisler 2011, Landy et al. 2011). This subfield studies how humans integrate different sources of information to perform a detection, discrimination, or estimation task. The investigators vary the number of cues available to the observers and assess the improvement in performance with additional sensory cues² relative to the improvement of an optimal combination. This framework has been successfully applied to the understanding of how humans combine visual cues, such as depth cues (Hillis et al. 2004; Jacobs 1999; Landy et al. 1995, 2011), texture cues (Landy & Kojima 2001), and cues across sensory modalities (visual, auditory, haptic; Ernst & Banks 2002, Helbig & Ernst 2007).

In some sense, many of the attentional manipulations also vary the sensory cues available to the observers. Typically, these cues change the available information to localize the target. Yet, the traditional field of attention has more often interpreted the variations of performance across experimental conditions as reflecting limitations of attention or some enhancement at the attended location. The same approach as in cue combination can be adopted to evaluate the contribution of attentional cues: evaluation of performance by an ideal observer.
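In symbols, the decision rule just described can be restated compactly (a minimal formalization in LaTeX, using notation consistent with Figure 2, where x denotes the vector of observed sensory responses):

```latex
% Ideal observer: choose the hypothesis with the highest posterior probability.
\hat{H} = \arg\max_{i} P(H_i \mid \mathbf{x})
        = \arg\max_{i} P(H_i)\, P(\mathbf{x} \mid H_i)
```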
The ideal observer approach has been applied to many visual attention tasks, such as visual search and the cueing paradigm [see Vincent (2015) for a tutorial of the mathematics]. Psychophysical studies can be used to compare optimal utilization of cues or processing of search arrays to that of humans. These studies assess whether the attentional facilitations or costs can be accounted for by simple variations in the available information that the observer can select or integrate to reach a perceptual decision.

²The term cues has been used in many different subfields of the perception literature, including cue combination and attention. Here, we use the term sensory cue to refer to cues from the cue-combination subfield and to distinguish it from attentional cues.

The formulations illustrated in Figure 2 (reconstructed from the figure panels):

Sensory cue combination (two cues): P(Hi | x, y) ∝ P(Hi) · P(x | Hi) · P(y | Hi)

Basic Bayes (one cue): P(Hi | x, y) ∝ P(Hi) · P(x, y | Hi)

Probabilistic sensory cues (the cues co-occur with probability π, e.g., π = 80% or 20%): P(Hi | x, y) ∝ P(Hi) · [π · P(x | Hi) · P(yc | Hi) + (1 − π) · P(x | Hi) · P(yu | Hi)]

Highly visible attentional cue: cued, P(Hi | x, Yc) ∝ P(Hi) · π · P(x | Hi); uncued, P(Hi | x, Yu) ∝ P(Hi) · (1 − π) · P(x | Hi)

Figure 2 (Left) General mathematical formulations. (Center) Labels for each scenario. (Right) Sample tasks and stimuli corresponding to each scenario. (Top row) Common Bayesian framework for cue combination of two statistically independent cues; the framework reduces to the product of the likelihoods [P(x | Hi) · P(y | Hi)] of each cue's sensory information given the Hi hypothesis (e.g., presence of a target) and the prior probability of the hypothesis, P(Hi). (Bottom row) The common Bayesian framework for the study of attention with a highly visible attentional cue predictive of the target. The sensory evidence (about the orientation of the line) at a cued and uncued location is weighted by the prior probability of the presence of the target given the cue (π). (Second from bottom) A less-explored intermediate scenario (probabilistic sensory cues) in which there are two cues (tilted line and brighter circle), but they only co-occur with a given probability (π). The scenario with a hard-to-detect cue (circle) that probabilistically co-occurs with the target (tilted line) connects the two subfields (cue combination and visual attention) and their theoretical Bayesian frameworks. In this case, the Bayesian ideal observer calculates the likelihoods of the sensory information for the tilted line under the two possible (and mutually exclusive) events: that it is paired with the bright circle (cued, with probability π) or that it is paired with the darker circle (uncued, with probability 1 − π). When the cue is highly visible, the second term is close to zero for the cued case and the first term approximates zero for the uncued case. This reduces the expression to the common formulation for the Bayesian ideal observer utilized for highly visible attentional cues (bottom). Note that additional terms for the likelihood calculation are needed for two-location tasks (regarding the second location), but for simplicity, those terms are omitted in this figure.
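The caption's reduction can be written out compactly (my restatement of the caption's argument in LaTeX; y denotes the sensory response to the circle cue):

```latex
% Probabilistic sensory cues: the circle co-occurs with the target with probability pi.
P(H_i \mid x, y) \propto P(H_i)\,\bigl[\pi\, P(x \mid H_i)\, P(y_c \mid H_i)
                      + (1-\pi)\, P(x \mid H_i)\, P(y_u \mid H_i)\bigr]
% When the cue is highly visible and the bright (cued) circle is unambiguously
% observed, P(y_u | H_i) -> 0, recovering the standard attentional weighting:
P(H_i \mid x, Y_c) \propto P(H_i)\, \pi\, P(x \mid H_i)
```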

The framework has also been applied to study how scene context facilitates the guidance of eye movements. In fact, there is a seldom-noticed connection between the common Bayesian frameworks proposed to study cue combination and the Bayesian models used to study attention. Take, for example, a task in which the observer has to identify a vertically oriented line versus a right-tilted line (Figure 2). The investigator measures accuracy deciding whether the tilted line is present or absent (a yes/no task). In a second condition, the original line stimulus is presented with a contiguous circle that is always brighter when presented with the tilted line signal and dimmer when presented with the vertical distractor line.

Assume that distinguishing the bright circle from the dimmer circle is as challenging as discriminating the tilted line from the vertical line. This is a classic cue-combination paradigm in which the line orientation provides one sensory cue and the circle luminance provides the second sensory cue. Performance correctly identifying the signal (tilted line) will improve when it is presented with the second sensory cue (bright circle). Now assume a modification of that experiment in which the tilted line is often (80% of the trials), but not always, paired with the brighter circle. This is closer to an attentional experiment, except that the predictive cue (the bright circle) is difficult to discriminate. If we take one more step and greatly increase the difference in luminance between the circles so that the bright circle can always be rapidly discriminated from the dim circle, then the experiment becomes a typical attentional cueing paradigm. The circle would be the cue orienting attention toward the likely location containing the harder-to-discriminate tilted line.

The optimal Bayesian framework for cue combination (Landy et al. 2011) of statistically independent sources is well known. The most general case of statistically independent sensory cues involves multiplying the sensory evidence (likelihoods) arising from each cue (Figure 2, top row). The Bayesian framework for visual attention is also well known (Eckstein et al. 2002, Vincent 2015, Yu & Dayan 2005). It involves multiplicatively weighting the sensory evidence (likelihoods) about the presence of the target by the prior probability of the signal given a cue or location (cue validity). But such a standard attention formulation omits the brain's detection of the cue, assuming that detection is error-free (Figure 2, bottom). The probabilistic sensory cues example in Figure 2, in which the cue (circle) is not trivially discriminated, is an experimental condition not typically explored, but it lies in between cue combination and classic attention paradigms and connects the modeling frameworks of both subfields. In such a task, the optimal observer considers the hypothesis of the tilted line being present under both possible scenarios: The two sensory cues jointly appear (the tilted line and bright circle appear together) or the tilted line appears with the dimmer circle. The formulation for this scenario, with two sensory cues with a probabilistic dependence, reduces to the attentional formulation (Bayesian weighting of sensory evidence by the prior probability of co-occurrence of cue and target) when the cue is highly visible (see Figure 2).

Taken together, the importance of the theoretical framework is that it suggests that computations that optimally take into account the probabilistic relationship between the target and other visual forms (synthetic cues, other objects, properties of scenes) can lead to performance enhancement. Readers not adept at taking into account the stochastic nature of visual processing in a formal model of search or attention might remain unconvinced by the claimed implications of the Bayesian ideal observer. Thus, is there any additional evidence that computations that take into account probabilistic relationships in the environment can benefit decision accuracy? Further evidence comes from the field of computer vision.
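To make the claim concrete, the cueing effect can be reproduced in simulation without any limited-capacity mechanism. Below is a minimal sketch (my own illustration, not code from the reviewed work) of a Bayesian ideal observer in a two-location localization task with an 80% valid cue; the only ingredients are noisy sensory responses and optimal weighting of the evidence by the cue-based prior:

```python
import numpy as np

rng = np.random.default_rng(0)
d_prime = 1.5       # target detectability (illustrative value)
pi_valid = 0.8      # cue validity: P(cue marks the target location)
n_trials = 200_000

# Target is at location 0 or 1; the cue marks the target with probability pi_valid.
target = rng.integers(0, 2, n_trials)
cue = np.where(rng.random(n_trials) < pi_valid, target, 1 - target)

# Unit-variance Gaussian responses; the target location gets mean d'.
x = rng.normal(0.0, 1.0, (n_trials, 2))
x[np.arange(n_trials), target] += d_prime

# Ideal observer: for equal-variance Gaussians, the log-likelihood ratio for
# "target at location 0" reduces to d'*(x0 - x1); the cue adds prior log-odds.
llr = d_prime * (x[:, 0] - x[:, 1])
prior = np.where(cue == 0, np.log(pi_valid / (1 - pi_valid)),
                 np.log((1 - pi_valid) / pi_valid))
choice = np.where(llr + prior > 0, 0, 1)

correct = choice == target
print("valid-cue accuracy:  ", round(correct[cue == target].mean(), 3))
print("invalid-cue accuracy:", round(correct[cue != target].mean(), 3))
```

Running this yields higher accuracy on valid than on invalid trials, a cueing effect that arises purely from the optimal weighting of evidence by the prior.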
A number of studies have shown how algorithms can incorporate scene cues that are predictive of the target location and/or its presence. Incorporating the probabilistic relationships between the scene cues and the target improves the accuracy of object detection and categorization (Kantorov et al. 2016, Mottaghi et al. 2014, Rabinovich et al. 2007, Torralba 2003, Torralba et al. 2003). Some of the models have focused on global scene properties (Figure 3a), whereas others have included object relationships (Choi et al. 2012). The computational form of some of these models resembles the general formulation of two sensory cues with a probabilistic dependence (Figure 2; probabilistic sensory cues). Importantly, in these proposed schemes, the algorithms do not have an attentional limitation in the underlying feature extraction for target–object processing or a foveated visual system.³

Figure 3 Use of probabilistic relationships between global features and target or multiple objects to improve object detector performance. (a) (Left to right) Original image, image with car detector choices indicated by bounding boxes, image representing priors for car locations for that image class, and final image where confidence (represented in the thickness of the bounding box) is modulated by context (Torralba et al. 2010). (b) Use of a probabilistic relationship between objects. (Left to right) Original image with object segmentation, labels for objects without using context, labels with context, and final image with bounding boxes and labels (Rabinovich et al. 2007). (c) An example of a by-product of using probabilistic relationships to find objects. Target objects at unexpected locations (car floating in the sky) can be incorrectly suppressed by the algorithm (Torralba et al. 2010).

Still, utilizing computations that take into account probabilistic relationships between scene features or objects and the target reduces false positives and misses (Figure 3a–c). Performance improvements do not come about from improved extraction of target features at a contextual location but instead come about as a result of the additional cues (global scene properties or objects that tend to co-occur with the target) providing information about the likely location or presence of the target.

³We note that there are models that seek to perform a cheap computation across the entire image to identify potential target locations and then perform computationally more expensive object detection at a subset of locations.


Such results also suggest that biological organisms can utilize computations that take into account probabilistic relationships between scene properties and the target to improve perceptual decisions and the guidance of eye movements. And such benefits would come about even without considering limited attentional resources or the foveated nature of visual processing.
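As a simple illustration of the scheme these computer vision models share, one can modulate a detector's per-location score by a context-based prior. The sketch below is hypothetical (the function name, array shapes, and values are my own, loosely inspired by the context-modulation idea of Torralba et al. 2010), not an implementation of any specific published model:

```python
import numpy as np

def context_modulated_scores(detector_logodds, context_prior, eps=1e-9):
    """Combine per-location detector log-odds with a context prior
    P(target at location) into posterior log-odds per location."""
    prior_logodds = np.log(context_prior + eps) - np.log(1 - context_prior + eps)
    return detector_logodds + prior_logodds

# Toy example: five candidate locations in an image.
detector = np.array([0.5, 2.0, 1.8, 0.2, -1.0])   # raw detector log-odds
prior = np.array([0.02, 0.05, 0.60, 0.30, 0.03])  # context favors mid-right
print(context_modulated_scores(detector, prior).round(2))
# A strong detector response at a very unlikely location (e.g., a car in the
# sky) gets suppressed, mirroring the failure mode shown in Figure 3c.
```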

3. BEHAVIORAL EFFECTS THAT CAN BE PREDICTED BY COMPUTATIONS THAT TAKE INTO ACCOUNT PROBABILISTIC RELATIONSHIPS


One fundamental advantage of synthetic tasks is that they provide the investigator full knowledge of the collection of stimuli in the experiment and allow for fewer assumptions in the formulation of the ideal observer. If the variability in the stimulus is manipulated, then the statistical properties governing the sensory data (often referred to as the generative model) are also known by the investigator. In the past, calculation of performance by an ideal observer without modern-day computers was difficult. Thus, researchers investigated approximations to the ideal observer within the broader class of signal detection theory (SDT) models (Cohn & Lasley 1974, Davis et al. 1983, Sperling & Dosher 1986). The commonalities in SDT models are that (a) each element in a stimulus is assumed to elicit a noisy internal representation that could be a transformation of the likelihood ratio or a simpler scalar; (b) the sensory variables across multiple locations are subsequently integrated (by summation, weighted average, or maximum) into a single decision variable; and (c) a decision is reached by comparing the decision variable to a decision criterion (yes/no) or to decision variables from other locations or hypotheses (alternative forced-choice, localization tasks). The rationale for the use of the SDT models remains the same as that of the ideal observer: to assess whether the organism's benefits or deficits in performance across experimental conditions can be accounted for by a model that integrates and selects sensory responses utilizing the available information, and, importantly, without assuming any further limitations such as limited resources, enhanced processing (signal-to-noise ratio) at attended locations, or temporally serial processing. For simple tasks, the signal detection models can often approximate the ideal observer. As the task becomes more complex, the multiplicity of choices for the integration algorithms in the SDT model increases, and the behavioral predictions might vary depending on the specifics of the algorithm and depart from the optimal (Ma et al. 2015, Schoonveld et al. 2007). One advantage of using the ideal Bayesian observer as a starting point for modeling is that there is an unambiguous formulation for a given task, known statistical properties of the sensory variables, and a known utility function [see Vincent (2015) for a more detailed tutorial].

What types of results has this framework been able to explain? A variety of visual search findings can be accounted for by this perspective, whether set size is manipulated by using cues indicating the likely locations of the target or by changing the physical appearance of distractors. The framework can account for set-size effects in a range of tasks, including detection in yes/no and two-interval forced-choice tasks (Davis et al. 2006; Palmer 1994; Palmer et al. 1993, 2000), identification of one of two targets (Baldassi & Verghese 2002, Cameron et al. 2004, Ma et al. 2015), and localization tasks (Cameron et al. 2004). The framework explains how set-size effects are small when target/distractor discriminability is high and increase when targets become more similar to the distractors (Eckstein et al. 2000, Vincent 2015). Distractor heterogeneity can have distinct effects on search performance, depending on the details of how the variability of the target and distractors is manipulated (e.g., equal or unequal variance, adding distractor types). The ideal observer can predict such effects (Ma et al. 2015, Rosenholtz 2001, Vincent et al. 2009), but some suboptimal integration rules will not (Rosenholtz 2001).
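The set-size effect itself falls out of such models without any capacity limit. Here is a minimal sketch (my own illustration) of an M-alternative localization task with a maximum-of-outputs rule, which for equal-variance Gaussian responses coincides with the ideal observer for this task:

```python
import numpy as np

rng = np.random.default_rng(1)
d_prime = 2.0        # target detectability (illustrative value)
n_trials = 100_000

for m in [2, 4, 8, 16]:
    # Localization among m locations: unit-variance Gaussian responses,
    # with the target (placed at index 0 without loss of generality) getting mean d'.
    x = rng.normal(0.0, 1.0, (n_trials, m))
    x[:, 0] += d_prime
    correct = x.argmax(axis=1) == 0   # pick the location with the largest response
    print(f"set size {m:2d}: proportion correct = {correct.mean():.3f}")
```

Accuracy declines as m grows simply because more noisy distractor responses get a chance to exceed the target's response; no serial scanning or resource limit is needed to produce a set-size effect.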

Search asymmetries consist of different set-size effects when the identity of the target is swapped with that of the distractors. Such findings can be accounted for by assuming that the encoding variability differs across elements (Vincent 2011a,b).

A central finding supporting feature integration theory (Treisman & Gelade 1980) is that set-size effects are larger for search arrays in which no unique feature discriminates the target from all distractors (known as conjunction displays). If each feature in the search array is theoretically treated as a separate dimension, a conjunction display is also an instance in which the target is more similar to the distractors for a variety of algorithms (including optimal ones) that integrate information across dimensions (Eckstein 1998, Eckstein et al. 2000).

Searching for a target whose visual characteristics are unknown but known to be different from an array of identical distractors is referred to as an odd-man-out search. In such displays, increasing the number of distractors does not degrade performance with set size and sometimes even improves it (Bravo & Nakayama 1992, Santhi & Reeves 2004). This has traditionally been attributed to facilitations related to distractor grouping effects (Bacon & Egeth 1991). In the ideal observer framework, the oddity-search set-size effect arises because each extra distractor provides additional information about which features define the distractors. This leads to flatter set-size effects for finding an unknown target among uniform distractors, in agreement with human results (Schoonveld et al. 2007).

In many of the search tasks, the set size is manipulated by using cues to the relevant locations, but often, the cues all indicate the same probability for the presence of the target (1/N, where N is the number of cued locations). A special case is when an uncued location has a nonzero probability of containing the target or there are different cues with different associated probabilities (Droll et al. 2009, Gekas et al. 2015). Also, a fixed location can have an increased probability of containing the target (Druker & Anderson 2010, Geng & Behrmann 2005, Walthew & Gilchrist 2006). The accuracy advantages at the high-target-probability locations, whether associated with a fixed spatial location or with a set of cues, can be accounted for within the Bayesian framework (Vincent 2011a,b).

The Bayesian ideal observer has also been formulated for the traditional Posner cueing task (Figure 1b) to show that the cueing effect is a by-product of the optimal weighted integration of information across two locations to maximize overall performance across all trials (Eckstein et al. 2002, Yu & Dayan 2005). The model also correctly predicts how cueing effects (hit rate in valid cue trials minus invalid cue trials) vary with target detectability across yes/no and two-alternative forced-choice tasks (Eckstein et al. 2013, Shimozaki et al. 2003).

How well can the model account for changes in eye movements in search arrays? Studies have shown that humans bias their eye movements during search to locations or cues associated with higher probabilities of containing a target (Droll et al. 2009; Jiang et al. 2014; Jones & Kaschak 2012; Liston & Stone 2008; Peterson & Kramer 2001; Shimozaki et al. 2012; Walthew & Gilchrist 2006).
Such increased frequency of eye movements toward locations or cues that are highly predictive of the target location is consistent with the Bayesian framework. However, the measured biases are often smaller than what would be expected from the experimental prior probabilities and a model that tries to maximize the probability of making an initial saccade to the target (Droll et al. 2009, Shimozaki et al. 2012). One common explanation for the discrepancy between humans and the model is that the goal of eye movements is not necessarily to maximize the probability of initially fixating the target but rather to efficiently explore the visual environment to support a subsequent perceptual decision (Najemnik & Geisler 2005, Renninger et al. 2007). Thus, although efficient exploration will likely involve foveating a highly probable target location, other factors such as minimizing the cost of executing eye movements (Araujo et al. 2001) might influence the destination of initial saccades.


[Figure 4 plot: proportion correct on valid and invalid cue trials (y-axis, 0.0–1.0) as a function of overall proportion correct over increasing signal strength (x-axis, 0.50–0.95), with valid and invalid data for humans, monkeys, bees, and the Bayesian ideal observer.]

Figure 4 Cueing effects for a simple two-alternative forced-choice task with three intermixed signal contrasts and a cue indicating the probable location of the target. Data for humans, monkeys, bees, and Bayesian ideal observer. Proportion correct on valid and invalid cue trials as a function of overall proportion correct for three signal strengths that were intermixed. Figure adapted from Eckstein et al. (2013).

4. CASE STUDY: EVALUATING OPTIMALITY OF THE UTILIZATION OF A PROBABILISTIC CUE ACROSS SPECIES

As an example, we consider a simple case in which the organism has to decide which of two spatial locations contains a target. A highly visible cue indicates the probable (e.g., 80% likelihood) location of the target. The signal contrast varies across trials and is not known to the organism. This simple task was chosen to allow comparison of behavior across three species: humans, monkeys, and honey bees. Monkeys and humans viewed similar stimuli: a Gaussian luminance signal embedded in visual luminance noise. Monkeys were trained to use a saccadic eye movement to indicate their choice of the target location. To match the monkey task, humans were also instructed to select the target location with an eye movement. The honey bees were trained to fly to one of two boxes containing a front paper panel with a specific color that could be hard to discriminate from that of the distractor box. A highly discriminable black pattern co-occurred with the target box in 75% of the trials. As with the human and monkey tasks, the contrast of the target (discriminability between the color paper panel on the target box versus the distractor box) in the honey bee task also varied.

Figure 4 shows performance (proportion correct) for the three species on validly and invalidly cued trials for the different intermixed signal contrasts. For comparison, Figure 4 also shows the expected difference in performance across valid and invalid cue trials (cueing effect) for an ideal Bayesian observer. The results show three interesting findings. First, all three species utilize the cue to influence their decisions, but in a less pronounced way than an ideal observer. Second, there seems to be a reduction in cue utilization from humans to bees. However, caution should be taken in such an interpretation, given the differences in how the information about the cue was communicated to the organisms (verbal instruction to humans versus training with rewards for the monkeys and bees).

Perhaps the more surprising result is that an organism with a small brain (a volume of about 1 mm³ and about 1 million neurons) and without a layered cortex can utilize probabilistic relationships to influence its decisions. This result is in agreement with a larger body of evidence that the ability to use probabilistic relationships to guide search and improve performance is prevalent across species. Studies have shown cueing effects in rats (Bushnell & Rice 1999, Marote & Xavier 2011) and pigeons (Shimp & Friedrich 1993). In addition, recent studies have shown that pigeons show contextual cueing effects both in scenes (Wasserman et al. 2014a,b) and in search arrays such as those used for the classic contextual cueing paradigm (Gibson et al. 2015). And there is increasing recognition of the role of selective attention mechanisms in the insect brain (de Bivort & van Swinderen 2016, Nityananda 2016).


5. PROBABILISTIC CUES IN REAL-WORLD SCENES

Much of our discussion has focused on synthetic displays in which the investigator varies the number of elements in a search array, varies the properties of the target and distractors, or introduces additional cues that provide information about the likely target location. Synthetic lab experiments provide strong test beds to differentiate across models and are well suited to study the underlying computations utilized by organisms. However, they provide an impoverished visual environment relative to realistic scenes.

Real scenes contain cues (scene properties) that provide information about the likely presence, identity, and location of targets. Studies have shown that object recognition and localization in scenes can be facilitated or disrupted by the contextual information (aspects of the scene) provided (Biederman 1972, Oliva & Torralba 2007, Palmer 1975, Wolfe et al. 2011). Scene context also guides eye movements toward expected target locations (Castelhano & Henderson 2007, Eckstein et al. 2006, Koehler & Eckstein 2017a, Mack & Eckstein 2011, Malcolm & Henderson 2010, Neider & Zelinsky 2006, Pereira & Castelhano 2014, Torralba et al. 2006, Võ & Henderson 2010).

What has been more difficult to define is which exact components or properties of scenes are utilized by the observer to guide and facilitate search. The term scene gist has often been identified as a critical component guiding search, although its definition has varied greatly across studies. Gist has often been defined as the basic-level category of a scene (Larson & Loschky 2009, Schyns & Oliva 1994, Thorpe et al. 2001), the background content of a scene (Wu et al. 2014), high-level relations between objects and backgrounds (Fei-Fei et al. 2007), knowledge of whether a particular object belongs in a scene (Castelhano & Heaven 2011), or a description of the main event or focal foreground objects in a scene (e.g., girl sitting on bed; Potter 1976).

Studies have shown that object-based information is not necessary for determining the basic-level category of scenes. Global scene statistics can be used to successfully categorize scenes and reliably predict spatial properties (e.g., openness, ruggedness) of scenes (Greene & Oliva 2009; Oliva & Torralba 2001, 2006). Determining the basic-level category of an image from the global scene statistics can aid computational models in determining the presence of targets and localizing them (Torralba et al. 2006, 2010). Additional studies have also highlighted that placing a target against a semantically inconsistent scene background lengthens the search (Davenport & Potter 2004, Malcolm & Henderson 2010).

There has been a recent effort to manipulate individual properties of scenes to identify which are critical components for search guidance. Pereira & Castelhano (2014) used a gaze-contingent paradigm and manipulated the visual information (background, objects in the foreground) available in the parafovea. They found that background information provides coarse guidance of eye movements to areas likely to contain a target, whereas object information provides more precise spatial information about where to search.


Koehler & Eckstein (2017a,b) manipulated three different components of scenes (Figure 5a): the background (anything plausibly immovable or nonconfigurable in the scene, e.g., ceilings, floors, sky, ground, trees, doors); object co-occurrence (the object that most typically co-occurs spatially with the target); and multiple-object configuration, consisting of other objects spatially more distant from the target that jointly form a spatial arrangement providing information about the target location. The results showed that object-based cues (rather than the background) were more critical in guiding eye movements and improving search accuracy.

Importantly, there is increasing evidence that the extent to which a scene cue guides and facilitates search is intricately related to the degree of inherent information the cue provides about the localization of the target (Koehler & Eckstein 2017a). The associations between cues and targets are determined by their natural co-occurrence and spatial proximity in real scenes, which can be estimated by quantifying objective measures of the statistics of relationships between objects and scenes (Greene 2016). The inherent information accessible to humans can be measured by assessing how observers' explicit judgments about where they expect a target (absent in a presented scene; Figure 5b) are influenced by removing scene cues such as the background and objects (Koehler & Eckstein 2017a).

What is the evidence for Bayesian-type computations accounting for the search facilitation and the increased frequency of eye movements to the target when it appears at an expected location? Unlike with synthetic displays, with real scenes the investigator has less control over the stimulus, the statistics of the sensory representations, and the prior probabilities associating target presence with different cues in the scenes. This prevents the formulation of a Bayesian ideal observer and makes precise quantitative comparisons between human behavior and models more difficult. A number of models that have implemented probabilistic Bayesian-type computations can mimic the facilitation of target detection at an expected location at a cost of lower detection when the target appears at an unexpected location (see Figure 3c for a car at an unexpected location missed by a model that incorporates scene context). The models can account for the influence of context on eye movements during search in real scenes (Eckstein et al. 2006, Kanan et al. 2009, Torralba et al. 2006), but other models with different attentional mechanisms can also be proposed to account for the results. Perhaps the strongest evidence in favor of the proposed probabilistic computations is a signature behavior that is distinct from models with limited resources⁴ but present in humans and Bayesian-type computations: a general bias to saccade to expected target locations even in the absence of the target (Figure 5b) (Eckstein et al. 2006; Koehler & Eckstein 2017a,b; Malcolm & Henderson 2010).


6. NEURAL COMPUTATIONS THAT TAKE INTO ACCOUNT PROBABILISTIC RELATIONSHIPS

What types of computations does the brain need to implement to take into account probabilistic relationships? This is a topic of much debate. Does the brain require a full probabilistic calculation (Figure 2 and Figure 6, model 1)? For many simple search tasks, one can show that optimal computations can be approximated by simpler operations, such as taking the maximum among a number of scalar variables representing the sensory evidence associated with different locations (Figure 6, model 2) (Nolte & Jaarsma 1967, Palmer et al. 2000, Pelli 1985).

⁴A model implementation of limited resources mediating covert attention would predict that allocating increased attentional resources at an expected target location should allow improved rejection of that location when the target is absent. Thus, initial eye movements would not be directed to such a location.


[Figure 5 panels: (a) search displays with scene context cues labeled full cue, O, M, B, and no cue; (b) example scenes for the targets coffee cup, painting, saddle, and chimney.]

Figure 5 (a) Example of search for a bottle cork with individual scene context cues (Koehler & Eckstein 2017a). Full cue: all contextual cues; O: object co-occurrence (the wine bottle); B: background; M: multiple-object configuration (the configuration of other objects is preserved, but the co-occurring object, the wine bottle, is not present); no cue: no contextual cues in the scene. (b) First saccade endpoints (blue) for different individuals looking for objects in scenes. In red, explicit reports about expected location for the target in the scene for an independent group of observers viewing images without the target.


[Figure 6 schematic: three processing pipelines over sensory variables xj. (1) Likelihood P(xj | Hi) weighted by prior probability πj, summation within each hypothesis, then MAX across hypotheses. (2) Likelihood P(xj | Hi) weighted by prior πj, then MAX. (3) Sensory variable xj plus an additive cue-based term bj, then MAX.]
Figure 6 Simple schematic of possible computations to take into account probabilistic relationships between targets and cues. (1) An ideal observer utilizes information about the expected distribution of the sensory variables to calculate a likelihood of the sensory variables given each considered hypothesis (Hi). The ideal observer multiplies the likelihoods for each location or stimulus by a prior probability based on the cues and sums the weighted likelihoods across all possible display states within each hypothesis. It finally takes the maximum across the hypotheses to make a decision on each trial. (2) A simpler model that utilizes the maximum across all likelihoods to make decisions about the hypotheses, without the summation stage. (3) The simplest model, in which no likelihood is calculated; the sensory variable is modulated by an additive term based on the cue's predictive probability (and inversely related to the target's detectability/discriminability), and a maximum of the variables is used to make a decision.

Similarly, for cueing tasks, the influence of the predictive cues can be implemented in terms of an additive term to the sensory evidence at cued locations (Figure 6, model 3). The additive term modulates or favors sensory variables associated with locations or cues with a higher probability of co-occurrence with the target. This additive term is related to the logarithm of the cue validity and is normalized by the target's detectability or discriminability from distractors (Eckstein et al. 2009, 2013; Gold & Shadlen 2007). When the target is very easy to detect or discriminate (high detectability), the cue-related additive term decreases its influence on the decision. In contrast, when the target is difficult to detect or discriminate from distractors, the cue-related additive term increases and has a large influence on the decision.

For more complex tasks, the simplest model, in which a maximum of decision variables is utilized (Figure 6, model 3), becomes highly suboptimal (Ma et al. 2015, Schoonveld et al. 2007). Such tasks include (a) search in which the variability (magnitude of statistical noise) changes across search elements (Ma et al. 2011); (b) tasks for which the observer is uncertain about the direction of change of the target from distractors (the target might be a luminance increment or decrement, or an orientation-defined target could be tilted clockwise or counterclockwise) (Baldassi & Verghese 2002, Cameron et al. 2004); and (c) tasks with uncertainty in the physical properties of both the target and distractors (for example, in oddity search, the observer knows that the target will be different from the distractors but does not know along which feature) (Schoonveld et al. 2007).


Optimal decisions for these tasks require more elaborate computations encoding the possible expected mean and variance (reliability) of each sensory variable (computing the likelihood; Figure 6, models 1 and 2) and summation of the likelihoods across all possible instances of the stimulus (Figure 6, model 1) (Eckstein et al. 2009, Green & Swets 1989, Ma 2012, Ma et al. 2015, Peterson et al. 1954, Schoonveld et al. 2007, Vincent 2015). Neurally plausible estimations of the variance of sensory variables (i.e., their reliability) have been proposed that consider a population of neurons and take into account the width of the distribution of likelihoods across the population (Ma et al. 2011). In addition, various studies have shown that different instances of an ideal observer, with different degrees of complexity, can be implemented in a neural network (Edwards et al. 2002, Kupinski et al. 2001, Ma et al. 2011, Myers et al. 1995). Where in the brain would such computations be implemented? Distinguishing the different types of underlying cue-related computations from neuronal activity can be challenging. Simple measures such as mean firing rate can be consistent with many different theoretical models (Eckstein et al. 2009). Discriminating across theories requires identifying where the probed neuron lies within the multistage computations of the model and also quantifying a variety of summary statistics of the neuronal activity (Eckstein et al. 2009). Recent studies suggest that the superior colliculus, involved in both eye movements and covert attention, might be a leading candidate for implementing computations that weight sensory inputs on the basis of probabilistic relationships between cues/locations and targets to optimize decisions and actions (Krauzlis et al. 2013, 2014; Sridharan et al. 2017).
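As a concrete illustration of these more elaborate computations, the sketch below (an illustration under assumed Gaussian noise, not code from any of the cited studies) computes the target-present evidence when the noise level differs across items (case a) and the target's direction of change is unknown (case b), weighting each item's likelihood by its own reliability and summing the likelihood over the possible target instances:

```python
import numpy as np
from scipy.stats import norm

def marginal_evidence(x, sigmas, priors, mus=(+1.0, -1.0)):
    """Prior-weighted likelihood-ratio evidence for 'target present' when
    item j has its own noise level sigma_j (reliability) and the target's
    direction of change (+mu or -mu) is unknown."""
    evidence = 0.0
    for xj, sj, pj in zip(x, sigmas, priors):
        # sum (marginalize) the likelihood over the possible target instances
        lik_target = np.mean([norm.pdf(xj, loc=mu, scale=sj) for mu in mus])
        lik_noise = norm.pdf(xj, loc=0.0, scale=sj)
        evidence += pj * lik_target / lik_noise
    return evidence  # compare with a criterion (e.g., 1.0) to decide
```

Note how a max rule applied to the raw sensory variables would miss, for example, strong negative evidence (a large decrement), which the likelihood computation correctly counts toward "target present."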

7. FURTHER ISSUES

This review has focused on how the perceptual decisions and eye movements of organisms (biological and artificial) can benefit from computations that take into account the probabilistic relationships between the searched target and other elements of the display or scene. These benefits are available even without invoking limited attentional resources and are in addition to the contributions of foveation. Such probabilistic computations can be framed using the theory of the Bayesian ideal observer. What is appealing about the framework? In many subfields of the physical sciences, there has been a strong tradition of pursuing unified frameworks that aim to explain a number of different phenomena under a common theory (e.g., the ambitious Standard Model in physics aims to unify the electromagnetic, strong, and weak forces under a common framework). In the field of covert attention, it has been common for investigators to propose a new attentional mechanism to explain the results of each new experimental paradigm, which has led to a multiplicity of metaphors and attention mechanisms. In this context, the merits of the probabilistic framework (Bayesian ideal observer theory) for studying search and attention are that (a) it can explain a variety of experimental results spanning from synthetic tasks to search in scenes, (b) it can be framed in terms of biologically plausible neuronal units, and (c) it connects computationally to the field of computer vision. Is the framework able to account for all attentional benefits to human performance? No. First, the term attention has been used for a wider range of phenomena not covered in this review (Carrasco 2011, Dosher & Lu 2013). Many studies have also shown interesting benefits in cueing and search tasks beyond what these types of models predict (Carrasco 2006, 2011; Davis et al. 2003; Luck et al. 1996; Palmer et al. 2011). In particular, tasks followed by visual masks, tasks relying on short-term memory and/or a temporal sequence of cues with varying validity (Dosher & Lu 2000, Lu & Dosher 1998), and tasks of higher complexity have been shown to produce larger performance benefits than the theory predicts.


In addition, physiological studies have shown increases in neuronal sensitivity as a separate attentional mechanism (Luo & Maunsell 2015, Maunsell & Cook 2002). For some time, much of the focus in the field has been on showing that a behavioral experiment produces an additional performance improvement or decrement beyond what these probabilistic models predict, and then stopping the endeavor there. A worthwhile path forward would be to partition performance improvements in search into those related to (a) computations that take into account probabilistic relationships in the environment, (b) the allocation of covert attentional resources, and (c) the process of foveation. This would provide a more global view of how different covert attentional and overt mechanisms contribute to performance in real-world search. Finally, aside from exploiting probabilistic relationships, there are certainly other important strategies the brain has implemented to ensure successful search (Eckstein 2011). For example, in recent years, many studies have highlighted the role of rewards in shaping eye movements and interacting with attention (Ackermann & Landy 2013, Eckstein et al. 2015, Foley et al. 2017, Hayhoe & Ballard 2014, Sullivan et al. 2012). There have also been a number of studies showing that human eye-movement planning takes into account the visibility of the target across the visual field to select fixations that maximize the acquisition of information supporting subsequent perceptual decisions (Najemnik & Geisler 2005, 2009; Renninger et al. 2005, 2007).


DISCLOSURE STATEMENT

The author is not aware of any affiliations, memberships, funding, or financial holdings that might be perceived as affecting the objectivity of this review.

LITERATURE CITED

Ackermann JF, Landy MS. 2013. Choice of saccade endpoint under risk. J. Vis. 13(3):27. https://doi.org/10.1167/13.3.27
Akbas E, Eckstein MP. 2014. Object detection through exploration with a foveated visual field. arXiv 1408.0814 [cs.CV]
Araujo C, Kowler E, Pavel M. 2001. Eye movements during visual search: the costs of choosing the optimal path. Vis. Res. 41(25–26):3613–25
Azzopardi P, Cowey A. 1993. Preferential representation of the fovea in the primary visual cortex. Nature 361(6414):719–21. https://doi.org/10.1038/361719a0
Bacon WF, Egeth HE. 1991. Local processes in preattentive feature detection. J. Exp. Psychol. Hum. Percept. Perform. 17(1):77–90
Baldassi S, Verghese P. 2002. Comparing integration rules in visual search. J. Vis. 2(8):3. https://doi.org/10.1167/2.8.3
Barlow HB. 1980. The absolute efficiency of perceptual decisions. Philos. Trans. R. Soc. B 290(1038):71–82
Biederman I. 1972. Perceiving real-world scenes. Science 177(4043):77–80. https://doi.org/10.1126/science.177.4043.77
Bravo MJ, Farid H. 2009. The specificity of the search template. J. Vis. 9(1):34. https://doi.org/10.1167/9.1.34
Bravo MJ, Nakayama K. 1992. The role of attention in different visual-search tasks. Percept. Psychophys. 51(5):465–72
Brockmole JR, Henderson JM. 2006. Using real-world scenes as contextual cues for search. Vis. Cogn. 13(1):99–108. https://doi.org/10.1080/13506280500165188
Bulf H, Johnson SP, Valenza E. 2011. Visual statistical learning in the newborn infant. Cognition 121(1):127–32. https://doi.org/10.1016/j.cognition.2011.06.010
Bundesen C. 1990. A theory of visual attention. Psychol. Rev. 97(4):523–47
Bushnell PJ, Rice DC. 1999. Behavioral assessments of learning and attention in rats exposed perinatally to 3,3′,4,4′,5-pentachlorobiphenyl (PCB 126). Neurotoxicol. Teratol. 21(4):381–92


Cameron EL, Tai JC, Eckstein MP, Carrasco M. 2004. Signal detection theory applied to three visual search tasks—identification, yes/no detection and localization. Spat. Vis. 17(4–5):295–325
Carrasco M. 2006. Covert attention increases contrast sensitivity: psychophysical, neurophysiological and neuroimaging studies. Prog. Brain Res. 154:33–70. https://doi.org/10.1016/S0079-6123(06)54003-8
Carrasco M. 2011. Visual attention: the past 25 years. Vis. Res. 51(13):1484–525. https://doi.org/10.1016/j.visres.2011.04.012
Castelhano MS, Heaven C. 2011. Scene context influences without scene gist: eye movements guided by spatial associations in visual search. Psychon. Bull. Rev. 18(5):890–96. https://doi.org/10.3758/s13423-011-0107-8
Castelhano MS, Henderson JM. 2007. Initial scene representations facilitate eye movement guidance in visual search. J. Exp. Psychol. Hum. Percept. Perform. 33(4):753
Choi MJ, Torralba A, Willsky AS. 2012. A tree-based context model for object recognition. IEEE Trans. Pattern Anal. Mach. Intel. 34(2):240–52. https://doi.org/10.1109/TPAMI.2011.119
Cohn TE, Lasley DJ. 1974. Detectability of a luminance increment: effect of spatial uncertainty. J. Opt. Soc. Am. 64(12):1715–19
Curcio CA, Sloan KR, Kalina RE, Hendrickson AE. 1990. Human photoreceptor topography. J. Comp. Neurol. 292(4):497–523. https://doi.org/10.1002/cne.902920402
Davenport JL, Potter MC. 2004. Scene consistency in object and background perception. Psychol. Sci. 15(8):559–64
Davis ET, Kramer P, Graham N. 1983. Uncertainty about spatial frequency, spatial position, or contrast of visual patterns. Percept. Psychophys. 33(1):20–28
Davis ET, Shikano T, Main K, Hailston K, Michel RK, Sathian K. 2006. Mirror-image symmetry and search asymmetry: a comparison of their effects on visual search and a possible unifying explanation. Vis. Res. 46(8–9):1263–81. https://doi.org/10.1016/j.visres.2005.10.032
Davis ET, Shikano T, Peterson SA, Keyes Michel R. 2003. Divided attention and visual search for simple versus complex features. Vis. Res. 43(21):2213–32
de Bivort BL, van Swinderen B. 2016. Evidence for selective attention in the insect brain. Curr. Opin. Insect Sci. 15:9–15. https://doi.org/10.1016/j.cois.2016.02.007
Dosher BA, Lu Z-L. 2000. Mechanisms of perceptual attention in precuing of location. Vis. Res. 40(10–12):1269–92
Dosher BA, Lu Z-L. 2013. Mechanisms of visual attention. In Human Information Processing: Vision, Memory, and Attention, ed. C Chubb, BA Dosher, Z-L Lu, RE Shiffrin, pp. 149–64. Washington, DC: Am. Psychol. Assoc.
Droll JA, Abbey CK, Eckstein MP. 2009. Learning cue validity through performance feedback. J. Vis. 9(2):18. https://doi.org/10.1167/9.2.18
Druker M, Anderson B. 2010. Spatial probability aids visual stimulus discrimination. Front. Hum. Neurosci. 4:1–10. https://doi.org/10.3389/fnhum.2010.00063
Duncan J, Humphreys GW. 1989. Visual search and stimulus similarity. Psychol. Rev. 96(3):433–58
Eckstein MP. 1998. The lower visual search efficiency for conjunctions is due to noise and not serial attentional processing. Psychol. Sci. 9(2):111–18
Eckstein MP. 2011. Visual search: a retrospective. J. Vis. 11(5):14. https://doi.org/10.1167/11.5.14
Eckstein MP, Beutter BR, Pham BT, Shimozaki SS, Stone LS. 2007. Similar neural representations of the target for saccades and perception during search. J. Neurosci. 27(6):1266–70. https://doi.org/10.1523/JNEUROSCI.3975-06.2007
Eckstein MP, Beutter BR, Stone LS. 2001. Quantifying the performance limits of human saccadic targeting during visual search. Perception 30(11):1389–401
Eckstein MP, Drescher BA, Shimozaki SS. 2006. Attentional cues in real scenes, saccadic targeting, and Bayesian priors. Psychol. Sci. 17(11):973–80. https://doi.org/10.1111/j.1467-9280.2006.01815.x
Eckstein MP, Mack SC, Liston DB, Bogush L, Menzel R, Krauzlis RJ. 2013. Rethinking human visual attention: spatial cueing effects and optimality of decisions by honeybees, monkeys and humans. Vis. Res. 85:5–19. https://doi.org/10.1016/j.visres.2012.12.011
Eckstein MP, Peterson MF, Pham BT, Droll JA. 2009. Statistical decision theory to relate neurons to behavior in the study of covert visual attention. Vis. Res. 49(10):1097


Eckstein MP, Schoonveld W, Zhang S, Mack SC, Akbas E. 2015. Optimal and human eye movements to clustered low value cues to increase decision rewards during search. Vis. Res. 113:137–54. https://doi.org/10.1016/j.visres.2015.05.016
Eckstein MP, Shimozaki SS, Abbey CK. 2002. The footprints of visual attention in the Posner cueing paradigm revealed by classification images. J. Vis. 2(1):3. https://doi.org/10.1167/2.1.3
Eckstein MP, Thomas JP, Palmer J, Shimozaki SS. 2000. A signal detection model predicts the effects of set size on visual search accuracy for feature, conjunction, triple conjunction, and disjunction displays. Percept. Psychophys. 62(3):425–51
Edwards DC, Metz CE, Nishikawa RM. 2002. Estimation of three-class ideal observer decision functions with a Bayesian artificial neural network. Proc. SPIE 4686:1–12. https://doi.org/10.1117/12.462662
Ernst MO, Banks MS. 2002. Humans integrate visual and haptic information in a statistically optimal fashion. Nature 415(6870):429–33. https://doi.org/10.1038/415429a
Faisal AA, Selen LPJ, Wolpert DM. 2008. Noise in the nervous system. Nat. Rev. Neurosci. 9(4):292–303. https://doi.org/10.1038/nrn2258
Fei-Fei L, Iyer A, Koch C, Perona P. 2007. What do we perceive in a glance of a real-world scene? J. Vis. 7(1):10. https://doi.org/10.1167/7.1.10
Findlay JM. 1997. Saccade target selection during visual search. Vis. Res. 37(5):617–31. https://doi.org/10.1016/S0042-6989(96)00218-0
Findlay JM, Gilchrist ID. 2003. Active Vision: The Psychology of Looking and Seeing. New York: Oxford Univ. Press. 1st ed.
Fiser J, Aslin RN. 2002. Statistical learning of new visual feature combinations by infants. PNAS 99(24):15822–26. https://doi.org/10.1073/pnas.232472899
Foley NC, Kelly SP, Mhatre H, Lopes M, Gottlieb J. 2017. Parietal neurons encode expected gains in instrumental information. PNAS 114(16):E3315–23. https://doi.org/10.1073/pnas.1613844114
Geisler WS. 2011. Contributions of ideal observer theory to vision research. Vis. Res. 51(7):771–81. https://doi.org/10.1016/j.visres.2010.09.027
Geisler WS, Chou K-L. 1995. Separation of low-level and high-level factors in complex tasks: visual search. Psychol. Rev. 102:356–78. https://doi.org/10.1037/0033-295X.102.2.356
Gekas N, Seitz AR, Seriès P. 2015. Expectations developed over multiple timescales facilitate visual search performance. J. Vis. 15(9):10. https://doi.org/10.1167/15.9.10
Geng JJ, Behrmann M. 2005. Spatial probability as an attentional cue in visual search. Percept. Psychophys. 67(7):1252–68
Gibson BM, Leber AB, Mehlman ML. 2015. Spatial context learning in pigeons (Columba livia). J. Exp. Psychol. Anim. Learn. Cogn. 41(4):336–42. https://doi.org/10.1037/xan0000068
Gold JI, Shadlen MN. 2007. The neural basis of decision making. Annu. Rev. Neurosci. 30:535–74. https://doi.org/10.1146/annurev.neuro.29.051605.113038
Green DM, Swets JA. 1989. Signal Detection Theory and Psychophysics. Los Altos, CA: Peninsula Publ.
Greene MR. 2016. Estimations of object frequency are frequently overestimated. Cognition 149:6–10. https://doi.org/10.1016/j.cognition.2015.12.011
Greene MR, Oliva A. 2009. Recognition of natural scenes from global properties: seeing the forest without representing the trees. Cogn. Psychol. 58(2):137–76. https://doi.org/10.1016/j.cogpsych.2008.06.001
Hayhoe M, Ballard D. 2014. Modeling task control of eye movements. Curr. Biol. 24(13):R622–28. https://doi.org/10.1016/j.cub.2014.05.020
Helbig HB, Ernst MO. 2007. Optimal integration of shape information from vision and touch. Exp. Brain Res. 179(4):595–606. https://doi.org/10.1007/s00221-006-0814-y
Henderson JM, Malcolm GL, Schandl C. 2009. Searching in the dark: cognitive relevance drives attention in real-world scenes. Psychon. Bull. Rev. 16(5):850–56. https://doi.org/10.3758/PBR.16.5.850
Hillis JM, Watt SJ, Landy MS, Banks MS. 2004. Slant from texture and disparity cues: optimal cue combination. J. Vis. 4(12):1. https://doi.org/10.1167/4.12.1
Jacobs RA. 1999. Optimal integration of texture and motion cues to depth. Vis. Res. 39(21):3621–29
Jiang YV, Won B-Y, Swallow KM. 2014. First saccadic eye movement reveals persistent attentional guidance by implicit learning. J. Exp. Psychol. Hum. Percept. Perform. 40(3):1161–73. https://doi.org/10.1037/a0035961


Jones JL, Kaschak MP. 2012. Global statistical learning in a visual search task. J. Exp. Psychol. Hum. Percept. Perform. 38(1):152–60. https://doi.org/10.1037/a0026233
Kanan C, Tong MH, Zhang L, Cottrell GW. 2009. SUN: top-down saliency using natural statistics. Vis. Cogn. 17(6–7):979–1003. https://doi.org/10.1080/13506280902771138
Kantorov V, Oquab M, Cho M, Laptev I. 2016. ContextLocNet: context-aware deep network models for weakly supervised localization. Proc. Eur. Conf. Comp. Vis., Amsterdam, Neth., Oct. 11–14, pp. 350–65. Cham, Switz.: Springer. https://doi.org/10.1007/978-3-319-46454-1_22
Knill DC, Pouget A. 2004. The Bayesian brain: the role of uncertainty in neural coding and computation. Trends Neurosci. 27(12):712–19. https://doi.org/10.1016/j.tins.2004.10.007
Koehler K, Eckstein MP. 2017a. Beyond scene gist: objects guide search more than backgrounds. J. Exp. Psychol. Hum. Percept. Perform. 43(6):1177–93. https://doi.org/10.1037/xhp0000363
Koehler K, Eckstein MP. 2017b. Temporal and peripheral extraction of contextual cues from scenes during visual search. J. Vis. 17(2):16. https://doi.org/10.1167/17.2.16
Koehler K, Guo F, Zhang S, Eckstein MP. 2014. What do saliency models predict? J. Vis. 14(3):14. https://doi.org/10.1167/14.3.14
Krauzlis RJ, Bollimunta A, Arcizet F, Wang L. 2014. Attention as an effect not a cause. Trends Cogn. Sci. 18(9):457–64. https://doi.org/10.1016/j.tics.2014.05.008
Krauzlis RJ, Lovejoy LP, Zénon A. 2013. Superior colliculus and visual spatial attention. Annu. Rev. Neurosci. 36:165–82. https://doi.org/10.1146/annurev-neuro-062012-170249
Kupinski MA, Edwards DC, Giger ML, Metz CE. 2001. Ideal observer approximation using Bayesian classification neural networks. IEEE Trans. Med. Imaging 20(9):886–99. https://doi.org/10.1109/42.952727
Landy MS, Banks MS, Knill DC. 2011. Ideal-observer models of cue integration. In Sensory Cue Integration, ed. J Trommershäuser, K Kording, MS Landy, pp. 5–29. New York: Oxford Univ. Press
Landy MS, Kojima H. 2001. Ideal cue combination for localizing texture-defined edges. J. Opt. Soc. Am. A 18(9):2307–20
Landy MS, Maloney LT, Johnston EB, Young M. 1995. Measurement and modeling of depth cue combination: in defense of weak fusion. Vis. Res. 35(3):389–412
Larson AM, Loschky LC. 2009. The contributions of central versus peripheral vision to scene gist recognition. J. Vis. 9(10):6. https://doi.org/10.1167/9.10.6
Levi DM. 2008. Crowding—an essential bottleneck for object recognition: a mini-review. Vis. Res. 48(5):635–54. https://doi.org/10.1016/j.visres.2007.12.009
Liston DB, Stone LS. 2008. Effects of prior information and reward on oculomotor and perceptual choices. J. Neurosci. 28(51):13866–75. https://doi.org/10.1523/JNEUROSCI.3120-08.2008
Lu Z-L, Dosher BA. 1998. External noise distinguishes attention mechanisms. Vis. Res. 38(9):1183–98. https://doi.org/10.1016/S0042-6989(97)00273-3
Luck SJ, Hillyard SA, Mouloua M, Hawkins HL. 1996. Mechanisms of visual-spatial attention: resource allocation or uncertainty reduction? J. Exp. Psychol. Hum. Percept. Perform. 22(3):725–37
Luo TZ, Maunsell JHR. 2015. Neuronal modulations in visual cortex are associated with only one of multiple components of attention. Neuron 86(5):1182–88. https://doi.org/10.1016/j.neuron.2015.05.007
Ma WJ. 2012. Organizing probabilistic models of perception. Trends Cogn. Sci. 16(10):511–18. https://doi.org/10.1016/j.tics.2012.08.010
Ma WJ, Navalpakkam V, Beck JM, Berg RVD, Pouget A. 2011. Behavior and neural basis of near-optimal visual search. Nat. Neurosci. 14(6):783–90. https://doi.org/10.1038/nn.2814
Ma WJ, Shen S, Dziugaite G, van den Berg R. 2015. Requiem for the max rule? Vis. Res. 116(B):179–93. https://doi.org/10.1016/j.visres.2014.12.019
Mack SC, Eckstein MP. 2011. Object co-occurrence serves as a contextual cue to guide and facilitate visual search in a natural viewing environment. J. Vis. 11(9):9. https://doi.org/10.1167/11.9.9
Malcolm GL, Henderson JM. 2009. The effects of target template specificity on visual search in real-world scenes: evidence from eye movements. J. Vis. 9(11):8. https://doi.org/10.1167/9.11.8
Malcolm GL, Henderson JM. 2010. Combining top-down processes to guide eye movements during real-world scene search. J. Vis. 10(2):4. https://doi.org/10.1167/10.2.4
Marote CFO, Xavier GF. 2011. Endogenous-like orienting of visual attention in rats. Anim. Cogn. 14(4):535–44. https://doi.org/10.1007/s10071-011-0388-3


Maunsell JHR, Cook EP. 2002. The role of attention in visual processing. Philos. Trans. R. Soc. B 357(1424):1063–72. https://doi.org/10.1098/rstb.2002.1107
Mottaghi R, Chen X, Liu X, Cho N-G, Lee S-W, et al. 2014. The role of context for object detection and semantic segmentation in the wild. Proc. IEEE Conf. Comput. Vis. Pattern Recognit., Columbus, OH, June 23–28, pp. 891–98. https://doi.org/10.1109/CVPR.2014.119
Myers KJ, Anderson MP, Brown DG, Wagner RF, Hanson KM. 1995. Neural network performance for binary discrimination tasks. Part II: effect of task, training, and feature preselection. Proc. SPIE 2434:828
Najemnik J, Geisler WS. 2005. Optimal eye movement strategies in visual search. Nature 434(7031):387–91. https://doi.org/10.1038/nature03390
Najemnik J, Geisler WS. 2009. Simple summation rule for optimal fixation selection in visual search. Vis. Res. 49(10):1286–94. https://doi.org/10.1016/j.visres.2008.12.005
Navalpakkam V, Itti L. 2005. Modeling the influence of task on attention. Vis. Res. 45(2):205–31. https://doi.org/10.1016/j.visres.2004.07.042
Neider MB, Zelinsky GJ. 2006. Scene context guides eye movements during visual search. Vis. Res. 46(5):614–21
Nityananda V. 2016. Attention-like processes in insects. Proc. R. Soc. B 283(1842):20161986. https://doi.org/10.1098/rspb.2016.1986
Nolte LW, Jaarsma D. 1967. More on the detection of one of M orthogonal signals. J. Acoust. Soc. Am. 41:497–505
Oliva A, Torralba A. 2001. Modeling the shape of the scene: a holistic representation of the spatial envelope. Int. J. Comput. Vis. 42(3):145–75. https://doi.org/10.1023/A:1011139631724
Oliva A, Torralba A. 2006. Building the gist of a scene: the role of global image features in recognition. Prog. Brain Res. 155:23–36
Oliva A, Torralba A. 2007. The role of context in object recognition. Trends Cogn. Sci. 11(12):520–27
Palmer EM, Fencsik DE, Flusberg SJ, Horowitz TS, Wolfe JM. 2011. Signal detection evidence for limited capacity in visual search. Atten. Percept. Psychophys. 73(8):2413–24. https://doi.org/10.3758/s13414-011-0199-2
Palmer J. 1994. Set-size effects in visual search: the effect of attention is independent of the stimulus for simple tasks. Vis. Res. 34(13):1703–21
Palmer J, Ames CT, Lindsey DT. 1993. Measuring the effect of attention on simple visual search. J. Exp. Psychol. Hum. Percept. Perform. 19(1):108–30. https://doi.org/10.1037/0096-1523.19.1.108
Palmer J, Verghese P, Pavel M. 2000. The psychophysics of visual search. Vis. Res. 40(10):1227–68
Palmer TE. 1975. The effects of contextual scenes on the identification of objects. Mem. Cogn. 3:519–26
Pelli DG. 1985. Uncertainty explains many aspects of visual contrast detection and discrimination. J. Opt. Soc. Am. A 2(9):1508–31. https://doi.org/10.1364/JOSAA.2.001508
Pereira EJ, Castelhano MS. 2014. Peripheral guidance in scenes: the interaction of scene context and object content. J. Exp. Psychol. Hum. Percept. Perform. 40(5):2056–72
Peterson MS, Kramer AF. 2001. Attentional guidance of the eyes by contextual information and abrupt onsets. Percept. Psychophys. 63(7):1239–49
Peterson W, Birdsall T, Fox W. 1954. The theory of signal detectability. Trans. IRE Prof. Group Inf. Theory 4(4):171–212. https://doi.org/10.1109/TIT.1954.1057460
Posner MI, Snyder CR, Davidson BJ. 1980. Attention and the detection of signals. J. Exp. Psychol. Gen. 109(2):160–74. https://doi.org/10.1037/0096-3445.109.2.160
Potter MC. 1976. Short-term conceptual memory for pictures. J. Exp. Psychol. Hum. Learn. Mem. 2(5):509
Rabinovich A, Vedaldi A, Galleguillos C, Wiewiora E, Belongie S. 2007. Objects in context. Proc. 2007 IEEE 11th Int. Conf. Comput. Vis., Rio de Janeiro, Braz., Oct. 14–21, pp. 1–8. https://doi.org/10.1109/ICCV.2007.4408986
Ren S, He K, Girshick R, Sun J. 2016. Faster R-CNN: towards real-time object detection with region proposal networks. arXiv 1506.01497v3 [cs.CV]
Renninger LW, Coughlan J, Verghese P, Malik J. 2005. An information maximization model of eye movements. Adv. Neural Inf. Process. Syst. 17:1121–28
Renninger LW, Verghese P, Coughlan J. 2007. Where to look next? Eye movements reduce local uncertainty. J. Vis. 7(3):6. https://doi.org/10.1167/7.3.6


Rosenholtz R. 2001. Visual search for orientation among heterogeneous distractors: experimental results and implications for signal-detection theory models of search. J. Exp. Psychol. Hum. Percept. Perform. 27(4):985
Rosenholtz R. 2016. Capabilities and limitations of peripheral vision. Annu. Rev. Vis. Sci. 2:437–57
Rovamo J, Leinonen L, Laurinen P, Virsu V. 1984. Temporal integration and contrast sensitivity in foveal and peripheral vision. Perception 13(6):665–74. https://doi.org/10.1068/p130665
Santhi N, Reeves A. 2004. The roles of distractor noise and target certainty in search: a signal detection model. Vis. Res. 44(12):1235–56. https://doi.org/10.1016/j.visres.2003.11.011
Schoonveld W, Shimozaki SS, Eckstein MP. 2007. Optimal observer model of single-fixation oddity search predicts a shallow set-size function. J. Vis. 7(10):1. https://doi.org/10.1167/7.10.1
Schyns PG, Oliva A. 1994. From blobs to boundary edges: evidence for time- and spatial-scale-dependent scene recognition. Psychol. Sci. 5(4):195–200
Shadlen MN, Newsome WT. 1998. The variable discharge of cortical neurons: implications for connectivity, computation, and information coding. J. Neurosci. 18(10):3870–96
Shimozaki SS, Eckstein MP, Abbey CK. 2003. Comparison of two weighted integration models for the cueing task: linear and likelihood. J. Vis. 3(3):3. https://doi.org/10.1167/3.3.3
Shimozaki SS, Schoonveld WA, Eckstein MP. 2012. A unified Bayesian observer analysis for set size and cueing effects on perceptual decisions and saccades. J. Vis. 12(6):27. https://doi.org/10.1167/12.6.27
Shimp CP, Friedrich FJ. 1993. Behavioral and computational models of spatial attention. J. Exp. Psychol. Anim. Behav. Process. 19(1):26–37
Sperling G, Dosher BA. 1986. Strategy optimization in human information processing. In Handbook of Perception and Human Performance, Vol. 1, ed. KR Boff, L Kaufman, JP Thomas, pp. 1–65. New York: John Wiley and Sons
Sridharan D, Steinmetz NA, Moore T, Knudsen EI. 2017. Does the superior colliculus control perceptual sensitivity or choice bias during attention? Evidence from a multialternative decision framework. J. Neurosci. 37(3):480–511. https://doi.org/10.1523/JNEUROSCI.4505-14.2017
Strasburger H, Rentschler I, Jüttner M. 2011. Peripheral vision and pattern recognition: a review. J. Vis. 11(5):13. https://doi.org/10.1167/11.5.13
Sullivan BT, Johnson L, Rothkopf CA, Ballard D, Hayhoe M. 2012. The role of uncertainty and reward on eye movements in a virtual driving task. J. Vis. 12(13):19. https://doi.org/10.1167/12.13.19
Tatler BW, Hayhoe MM, Land MF, Ballard DH. 2011. Eye guidance in natural vision: reinterpreting salience. J. Vis. 11(5):5. https://doi.org/10.1167/11.5.5
Thorpe SJ, Gegenfurtner KR, Fabre-Thorpe M, Bülthoff HH. 2001. Detection of animals in natural images using far peripheral vision. Eur. J. Neurosci. 14(5):869–76. https://doi.org/10.1046/j.0953-816x.2001.01717.x
Torralba A. 2003. Contextual priming for object detection. Int. J. Comput. Vis. 53(2):169–91. https://doi.org/10.1023/A:1023052124951
Torralba A, Murphy KP, Freeman WT. 2010. Using the forest to see the trees: exploiting context for visual object detection and localization. Commun. ACM 53(3):107–14. https://doi.org/10.1145/1666420.1666446
Torralba A, Murphy KP, Freeman WT, Rubin MA. 2003. Context-based vision system for place and object recognition. Proc. Ninth IEEE Int. Conf. Comput. Vis., Oct. 13–16, Nice, France, pp. 273–80. https://doi.org/10.1109/ICCV.2003.1238354
Torralba A, Oliva A, Castelhano MS, Henderson JM. 2006. Contextual guidance of eye movements and attention in real-world scenes: the role of global features in object search. Psychol. Rev. 113(4):766–86. https://doi.org/10.1037/0033-295X.113.4.766
Treisman A, Gelade G. 1980. A feature-integration theory of attention. Cogn. Psychol. 12(1):97–136
Vincent BT. 2011a. Covert visual search: prior beliefs are optimally combined with sensory evidence. J. Vis. 11(13):25. https://doi.org/10.1167/11.13.25
Vincent BT. 2011b. Search asymmetries: parallel processing of uncertain sensory information. Vis. Res. 51(15):1741–50. https://doi.org/10.1016/j.visres.2011.05.017
Vincent BT. 2015. Bayesian accounts of covert selective attention: a tutorial review. Atten. Percept. Psychophys. 77(4):1013–32. https://doi.org/10.3758/s13414-014-0830-0


Vincent BT, Baddeley RJ, Troscianko T, Gilchrist ID. 2009. Optimal feature integration in visual search. J. Vis. 9(5):15. https://doi.org/10.1167/9.5.15
Võ ML-H, Henderson JM. 2010. The time course of initial scene processing for eye movement guidance in natural scene search. J. Vis. 10(3):14. https://doi.org/10.1167/10.3.14
Walthew C, Gilchrist ID. 2006. Target location probability effects in visual search: an effect of sequential dependencies. J. Exp. Psychol. Hum. Percept. Perform. 32(5):1294–301. https://doi.org/10.1037/0096-1523.32.5.1294
Wasserman EA, Teng Y, Brooks DI. 2014a. Scene-based contextual cueing in pigeons. J. Exp. Psychol. Anim. Learn. Cogn. 40(4):401–18. https://doi.org/10.1037/xan0000028
Wasserman EA, Teng Y, Castro L. 2014b. Pigeons exhibit contextual cueing to both simple and complex backgrounds. Behav. Process. 104:44–52. https://doi.org/10.1016/j.beproc.2014.01.021
Wolfe JM. 1998a. Visual search. In Attention, ed. HE Pashler, pp. 13–56. East Sussex, UK: Psychology Press
Wolfe JM. 1998b. What can 1 million trials tell us about visual search? Psychol. Sci. 9(1):33–39
Wolfe JM. 2007. Guided Search 4.0: current progress with a model of visual search. Integr. Models Cogn. Syst. 25:1–57
Wolfe JM, Võ ML-H, Evans KK, Greene MR. 2011. Visual search in scenes involves selective and nonselective pathways. Trends Cogn. Sci. 15(2):77–84. https://doi.org/10.1016/j.tics.2010.12.001
Wu C-C, Wang H-C, Pomplun M. 2014. The roles of scene gist and spatial dependency among objects in the semantic guidance of attention in real-world scenes. Vis. Res. 105:10–20. https://doi.org/10.1016/j.visres.2014.08.019
Yu AJ, Dayan P. 2005. Uncertainty, neuromodulation, and attention. Neuron 46(4):681–92. https://doi.org/10.1016/j.neuron.2005.04.026
Zeiler MD, Fergus R. 2014. Visualizing and understanding convolutional networks. Proc. Eur. Conf. Comp. Vis., Zurich, Switz., Sept. 6–12, ed. D Fleet, T Pajdla, B Schiele, T Tuytelaars, pp. 818–33. Cham, Switz.: Springer. https://doi.org/10.1007/978-3-319-10590-1_53
Zelinsky GJ. 2008. A theory of eye movements during target acquisition. Psychol. Rev. 115(4):787–835. https://doi.org/10.1037/a0013118
Zhang S, Eckstein MP. 2010. Evolution and optimality of similar neural mechanisms for perception and action during search. PLOS Comput. Biol. 6(9):e1000930. https://doi.org/10.1371/journal.pcbi.1000930
