Topics in Cognitive Science (2014) 1–7
Copyright © 2014 Cognitive Science Society, Inc. All rights reserved.
ISSN: 1756-8757 print / 1756-8765 online
DOI: 10.1111/tops.12087

Cognitive Science as an Interface Between Rational and Mechanistic Explanation

Nick Chater

Warwick Business School, University of Warwick

Received 29 March 2013; accepted 14 August 2014

Abstract

Cognitive science views thought as computation; and computation, by its very nature, can be understood in both rational and mechanistic terms. In rational terms, a computation solves some information processing problem (e.g., mapping sensory information into a description of the external world; parsing a sentence; selecting among a set of possible actions). In mechanistic terms, a computation corresponds to a causal chain of events in a physical device (in an engineering context, a silicon chip; in a biological context, the nervous system). The discipline is thus at the interface between two very different styles of explanation—as the papers in the current special issue well illustrate, it explores the interplay of rational and mechanistic forces.

Keywords: Rationality; Mechanistic explanation; Cognitive science; Computation

Correspondence should be sent to Nick Chater, Warwick Business School, University of Warwick, Coventry CV4 7AL, UK. E-mail: [email protected]

1. Combining rational and mechanistic explanation: The big picture

How should rational and mechanistic analysis be combined? Halpern, Pass, and Seeman (this issue) consider this question at a high level of abstraction. Mechanistic constraints are modeled as limitations on the runtime or memory requirements of potential cognitive algorithms; and the challenge for the agent is to choose the algorithm with the best performance, subject to those constraints. Halpern, Pass, and Seeman review and develop two types of approach to exploring how this can be done. Even at this high level of generality, they argue that some general characteristics of human thought can be predicted. For example, if it is assumed that there is a cost to absorbing new evidence (a mechanistic constraint), then, as evidence accumulates and we become increasingly sure of our conclusion, we should pay decreasing attention to further information. Eventually, the fixed costs of absorbing that information will exceed the decreasing expected benefit
from keeping open the possibility that new evidence may overturn our current beliefs. This tendency to “jump to conclusions” based on limited evidence, and hence to overweight initial evidence, is indeed widely observed empirically (e.g., Peterson & DuCharme, 1967). Similarly, confirmation bias (the tendency to search for confirmatory evidence for one’s favorite “theory”; Nickerson, 1998) and consequent belief polarization (where beliefs diverge even when people are exposed to the same evidence; e.g., Lord, Ross, & Lepper, 1979) can be explained by proposing that more information processing effort is rationally assigned to connecting new data with a theory one considers more likely to be true.

This style of explanation has, as Halpern, Pass, and Seeman indicate, interesting parallels with explanations of the “biases” that arise when an agent has choices about which information to gather (rather than, as here, which information to process) (Denrell & Le Mens, 2011). For example, one explanation of the emergence of negative stereotypes for out-groups, a widely observed finding in social psychology, is that people can easily avoid out-group members after a bad experience (and hence the impact of that bad experience is never corrected). By contrast, they cannot avoid continuing to encounter large numbers of in-group members, so any bad experiences with in-group members will be corrected. Halpern, Pass, and Seeman’s discussion suggests that we might expect similar patterns to arise where information processing is the limiting factor. Thus, even when people are unavoidably exposed to large samples of, say, arguments for and against a particular scientific, religious, or political viewpoint, they may rationally put less processing effort into understanding viewpoints they are fairly sure are incorrect. Similarly, they will rationally be less inclined to devote even a small amount of processing resources to evaluating viewpoints for which the information processing “barriers to entry” are high (because greater computational investment is required for a given “return” in terms of information gained). These factors will tend to lead rationally bounded agents to diverge in their viewpoints, depending on how they initially sample the same pool of evidence; and, moreover, will lead rational agents to under-sample complex styles of explanation (perhaps leading to a bias within cognitive science itself against sophisticated mathematical, computational, or neuroscientific analysis). Furthermore, if agents differ in relevant resources, then standard “optimal foraging”-style explanation would lead us to anticipate that agents with high levels of relevant cognitive resources (e.g., a relevant background in mathematics, computer science, neuroscience, or whatever it may be) will be especially drawn to such “complex” explanations (because they are under-sampled by the population at large). Thus, perhaps there is a rational foundation to the dictum that to a person with a hammer, everything looks like a nail.

Lewis, Howes, and Singh (this issue) consider the relationship between rational and mechanistic explanation in the context of the methodology of constructing theories in cognitive science. They argue that theories can be derived by specifying three elements: (a) the structure of the environment, (b) the bounded machine on which the computation must be run, and (c) a utility function.
The computation which provides the solution to such Optimal Program Problems is the resulting theory, which can be compared with empirical data; and in light of such comparison, the three elements of the theory can be
iteratively refined. They provide a rigorous formal analysis of this style of explanation, which can also be viewed as a refinement of John Anderson’s (1990, 1991) rational analysis methodology, which has been widely applied in many areas of cognitive science (e.g., Anderson, 1990, 1991; Chater & Oaksford, 2008; Oaksford & Chater, 1998).

Trimmer and Houston (this issue) also consider the interface between mechanistic and rational explanation, but from the standpoint of natural selection. They argue that natural selection operates over cognitive mechanisms in specific environments (element (a) in Lewis, Howes, and Singh’s framework)—and hence behavior may be roughly optimal in such environments. Following Gigerenzer and colleagues (e.g., Gigerenzer & Todd, 2012), they stress that natural selection may favor such “ecological rationality.” But when the agent (animal or human) is placed in an “ecology” very different from that in which natural selection operated (perhaps the experimental laboratory), those same mechanisms may perform far from optimally.
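Halpern, Pass, and Seeman's stopping argument can be made concrete with a small numerical illustration. The following is a minimal sketch of my own, not a model taken from their paper: a Bayesian agent weighs the expected gain in decision accuracy from processing a further batch of evidence against a fixed processing cost; the likelihoods, batch size, and cost are illustrative assumptions only.

```python
# A minimal sketch, not a model from Halpern, Pass, and Seeman's paper: a
# Bayesian agent choosing between hypothesis H and not-H weighs the expected
# gain in decision accuracy from processing one more batch of evidence against
# a fixed processing cost. The likelihoods, batch size, and cost below are
# illustrative assumptions only.
from math import comb

def value_of_more_evidence(p, k=10, lik=0.7):
    """Expected gain in decision accuracy from processing k further binary
    observations, given the current posterior p = P(H | data so far).
    Each observation is 'heads' with probability lik under H, 1 - lik under not-H."""
    accuracy_now = max(p, 1 - p)                 # accuracy if we decide immediately
    expected_accuracy_later = 0.0
    for h in range(k + 1):                       # h = number of 'heads' in the batch
        lik_h = comb(k, h) * lik ** h * (1 - lik) ** (k - h)        # P(h heads | H)
        lik_not = comb(k, h) * (1 - lik) ** h * lik ** (k - h)      # P(h heads | not-H)
        prob_h = p * lik_h + (1 - p) * lik_not                      # predictive probability
        p_new = p * lik_h / prob_h                                  # Bayesian update
        expected_accuracy_later += prob_h * max(p_new, 1 - p_new)
    return expected_accuracy_later - accuracy_now

processing_cost = 0.02   # hypothetical fixed cost of absorbing the batch
for p in (0.50, 0.60, 0.70, 0.80, 0.90, 0.95, 0.99):
    gain = value_of_more_evidence(p)
    verdict = "worth processing" if gain > processing_cost else "rational to ignore"
    print(f"P(H | data) = {p:.2f}: expected gain from more evidence = {gain:.3f} ({verdict})")
```

On this toy analysis, further evidence is worth processing while the agent is uncertain, but once the posterior is sufficiently extreme the expected gain falls below the fixed cost and the rational move is to stop attending to new information; this is the sense in which "jumping to conclusions" can be resource-rational.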

2. Combining rational and mechanistic explanation: Case studies

The remaining papers can be viewed as considering the interface between rational explanation and mechanism in specific contexts. Holmes and Cohen (this issue) consider the buildup of evidence over time in simple binary perceptual decisions. They note that a mechanistically plausible model, based on a possible abstraction of neural function, the Leaky Competing Accumulator (Usher & McClelland, 2001), tends, as a limiting case, toward an optimal rational model. This optimal model matches many aspects of observed empirical data; and Holmes and Cohen show how departures from optimality in observed behavior are particularly informative concerning underlying cognitive and neural mechanisms.

Hahn (this issue) stresses that apparently boundedly rational, or downright irrational, behavior may sometimes result from a misspecification of the problem faced by the agent. For example, in judging and attempting to generate “random” sequences of coin flips, people appear to be biased toward overly high rates of alternation between Heads and Tails (i.e., such alternations occur more frequently than is observed in long-run random sequences, where the probability of switching on each trial is, of course, 50% with a fair coin). But Hahn and Warren (2009) note that sequences such as HHHH and HTHT are distributed very differently: Uniform sequences are much more likely to be clumped together (indeed, having encountered a subsequence HHHH, there is only a ¼ chance that neither adjacent “moving window of size 4” will also contain HHHH—this will occur only if both the previous and the succeeding coins fall Tails). But suppose that the key mechanistic constraint is that the brain is sensitive to small and highly local samples. Then substrings, such as HHHH, which have very “clumped” distributions, are more likely not to be encountered at all—e.g., if we happen to sample the sequence of coin flips in a period “between” clumps. An alternative mechanistic explanation might focus on representation: If “runs” are encoded by total run length, then only ¼ of HHHH strings will directly be encoded as such—the rest will instead be encoded as parts of longer runs (of
5, 6, or more Heads).¹ The broader point is that taking account of the specific mechanistic constraints concerning how the agent’s actual input sequences are represented and processed may provide an Optimal Program explanation of apparent biases.

Finally, Dayan (this issue) provides a range of elegant illustrations of how the various, and unavoidable, approximations to optimality can explain a variety of apparently irrational behaviors. In determining which actions, or more generally, sequences of actions, lead to good outcomes, various approaches are possible. Actions may be “wired in” by natural selection, avoiding problems of learning entirely (what Dayan terms the Pavlovian system); the agent can aim to learn which actions are “good” (irrespective of what goal they help achieve)—this is model-free instrumental learning; or, more ambitiously, the agent can attempt to learn the specific connection between actions and their outcomes (this is model-based instrumental learning). A Pavlovian system is computationally cheap and requires no learning data, but it is inflexible. Model-free control is computationally fairly cheap and somewhat flexible, but it responds poorly to changing goals and environments. By contrast, model-based control can be fast and flexible, but it can be cripplingly computationally expensive. Dayan argues that the behavior of humans and non-human animals can neatly be explained as resulting from the interactions of these different systems.

Dayan also raises one of the fundamental challenges to optimal theories of behavior—the sheer extent of choice variability (in most tasks, it would be better to stick to the “best” behavior, rather than distributing responses across a range of behaviors). He suggests that such variability may have a range of sources, from noise in input data to internal noise in the calculations of the different control systems. He notes, too, that the existence of such internal noise may provide a concrete explanation of “jumping to conclusions” (as discussed more generally by Halpern, Pass, and Seeman): If internal calculations are sufficiently subject to noise, then collecting more data may be pointless. Dayan argues that paranoid patients, who jump to conclusions in experimental tasks more than controls, may do so not because of any deficit of rationality per se, but because they are adapting their behavior to their elevated levels of internal noise (Moutoussis, Bentall, El-Deredy, & Dayan, 2011).
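Hahn and Warren's local-window point above lends itself to a direct check. The following minimal sketch is my own illustration, not code from any of the papers in the issue; the window length of eight flips is an arbitrary choice. It enumerates every equally likely window of fair-coin flips and asks how often a "clumped" pattern (HHHH) versus an "alternating" pattern (HTHT) appears at least once.

```python
# A minimal sketch, my own illustration rather than code from any of the papers:
# enumerate every equally likely window of fair-coin flips and ask how often a
# "clumped" pattern (HHHH) versus an "alternating" pattern (HTHT) appears at
# least once. The window length of eight flips is an arbitrary choice.
from itertools import product

def prob_window_contains(pattern, window_len):
    """Exact probability that a window of window_len fair coin flips contains
    the pattern at least once (by enumerating all 2 ** window_len windows)."""
    windows = ("".join(flips) for flips in product("HT", repeat=window_len))
    hits = sum(1 for w in windows if pattern in w)
    return hits / 2 ** window_len

for pattern in ("HHHH", "HTHT"):
    print(f"P({pattern} occurs somewhere in 8 flips) = {prob_window_contains(pattern, 8):.3f}")
```

Because occurrences of HHHH bunch together, fewer windows contain it at least once than contain HTHT, even though any particular starting position is equally likely to hold either pattern; an observer restricted to small local samples would therefore encounter HHHH less often, in line with the argument above.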

3. Optimizing what?

Optimality explanations as described in this special issue and elsewhere (e.g., Anderson, 1990; Knill & Richards, 1996) are powerful tools for gaining insights into cognition. But, while not necessarily essential to the approach, such models typically presuppose that there is an externally given criterion to optimize (element (c) in Lewis, Howes, and Singh’s analysis). Thus, the utility function is generally viewed as exogenous—that is, determined externally by the theorist. But for many aspects of human behavior, an exogenously given utility function cannot be presupposed. People can, to a degree, choose which objectives they wish to pursue; they are not, as it were, trapped in a video game, with the sole, and preset, objective of maximizing their tally of points. Rather, one of the goals of cognition is to decide which things we like and dislike and which objectives we will
pursue—and it is by no means obvious that such “tastes,” “preferences,” and “objectives” reduce to a single notion of value or reward with respect to which cognition should be optimized.

Interestingly, economists have long since abandoned the idea that utilities are exogenously given and have instead switched to a notion of utility that emerges from the preferences of the agent. In this regard, cognitive science may have much to learn from the types of sophisticated models, and conceptions of rationality, that economics has generated. In particular, since the revealed preference revolution of the 1930s (Samuelson, 1938), economists and mathematicians have abandoned the supposition that people are attempting to optimize any externally given criterion (e.g., some psychologically interpretable notion of utility, perhaps to be quantified in units of pleasure and pain). Rather, economic agents are typically assumed to be subject only to relatively mild consistency conditions (such as transitivity: if they prefer A to B and B to C, then they prefer A to C, and so on); given such conditions, it can be shown that there will exist a set of probabilities and utilities such that each agent’s choices will be just “as if” that agent were maximizing expected utility, according to those probabilities and utilities (e.g., Savage, 1954). It is not clear whether this instrumentalist view of probabilities and utilities meshes with the type of optimality analysis described here—because the predictive power of optimality explanations is in large part derived by fixing the objective and showing that behavior subserves that objective. But in typical economic explanation, the order of explanation appears to be reversed: The notion of utility itself is defined in terms of observed behavior.

Now in many contexts, of course, an exogenously given utility function can reasonably be conjectured; for example, in motor control, we may reasonably take an agent’s objective to be to rapidly and accurately grasp a cup and bring it to the lips (though the trade-off between speed, accuracy, spillage, and so on may be less clear). Similarly, in foraging, it seems reasonable that an animal aims to obtain the maximal rate of caloric intake (again, the precise balance between calories obtained and energy expended, risk of predation, other nutritional factors beyond mere caloric intake, need for water, and so on, may be unclear—although in many contexts these are of secondary importance). From an economic point of view, there is no right or wrong answer concerning how these trade-offs should be made; so long as the agent obeys the appropriate consistency conditions, the claim is just that utility can be assigned to each factor, so that the agent’s behavior can be viewed as maximizing that utility. The order of explanation in the optimality tradition is typically different—and more ambitious: to explain the trade-offs between different local objectives as themselves promoting some unified, but ultimately exogenously given, objective (ultimately grounded, perhaps, in the optimization of the rate of reproduction [e.g., Singh, Lewis, Barto, & Sorg, 2010], thus embedding the approach within the adaptationist explanation of evolutionary theory [Williams, 1966]). The picture becomes more complex still if we observe that the assumptions required to “reveal” a utility function are systematically violated in human choice behavior (and perhaps also in animal choice behavior).
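The “as if” construction underlying revealed preference can be illustrated in its very simplest, riskless form. The following sketch is my own illustration, far simpler than Savage's treatment of choice under uncertainty: given consistent pairwise choices over a small finite set of options, a representing utility function can be read off; an intransitive cycle admits none. The options and choice patterns are toy examples.

```python
# A minimal sketch, my own illustration of the riskless, finite-option case only
# (far simpler than Savage's treatment of choice under uncertainty): if pairwise
# choices are consistent, a utility function representing them can be read off;
# an intransitive cycle admits no such function. Options and choices are toy examples.
from itertools import combinations, permutations

def utility_from_choices(options, prefers):
    """Return a dict {option: utility} that represents the pairwise choice
    function prefers(a, b) -> True iff a is chosen over b, or None if no
    utility function can represent these choices (e.g., they contain a cycle)."""
    for ranking in permutations(options):        # brute force is fine for tiny examples
        utility = {opt: len(options) - i for i, opt in enumerate(ranking)}
        if all((utility[a] > utility[b]) == prefers(a, b)
               for a, b in combinations(options, 2)):
            return utility
    return None

options = ["apple", "banana", "cherry"]

# Consistent (transitive) choices: apple over banana, banana over cherry, apple over cherry.
consistent = {("apple", "banana"), ("banana", "cherry"), ("apple", "cherry")}
print(utility_from_choices(options, lambda a, b: (a, b) in consistent))

# Intransitive cycle: apple over banana, banana over cherry, but cherry over apple.
cyclic = {("apple", "banana"), ("banana", "cherry"), ("cherry", "apple")}
print(utility_from_choices(options, lambda a, b: (a, b) in cyclic))   # prints None
```

The order of explanation discussed in the text is visible here: the utilities are constructed from the observed choices, rather than the choices being predicted from independently given utilities.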
For example, it has often been argued that the empirical data indicate that preferences are not always transitive (e.g., Tversky, 1969); and the entire field of judgement and decision making provides endless illustrations of violations of a
variety of rational principles. For example, consider Shafir’s (1993) shocking demonstration that people can choose the same “extreme” option (with both very good and very bad features) in a binary choice, whether asked to select the option they would like to accept or the option they would like to reject. This pattern may be explicable by assuming that the distinct tasks (“accept” vs. “reject”) prompt people to focus on different reasons (positive vs. negative), leading to different outcomes. But which of these contradictory choices reflects our “true” preference? Does this question even make sense?

So optimality explanation faces two challenges in domains in which the “aims” of behavior are not externally given. (a) Where a utility function has to be inferred from preferences, is the optimization of utility still a useful explanatory strategy in cognitive science? Or does the approach fall into circularity? (b) And, perhaps more urgently, given the unruly nature of actual preference judgements and observed choices, is behavior usefully modeled as optimizing a utility function at all?

Irrespective of the answers to these questions in general, however, the papers in this special issue, and the wider tradition on which they are based, make it amply clear that there are many specific cognitive domains in which an externally defined measure of utility can be motivated, and in which optimality explanation, paying careful attention to mechanistic constraints, provides a powerful explanatory framework.

Acknowledgments

N. C. was supported by ERC Advanced Grant RATIONALITY, ESRC Grant “Network for Integrated Behavioural Science,” the Leverhulme Trust award on “Risk, time and society,” and the Templeton Foundation grant “Decision time for free will.”

Note

1. Thus, if the learner acquires an inventory of “chunks,” with frequencies, from which the sequence is presumed to be generated by concatenation, then the frequencies with which particular chunks, such as HHHH and HTHT, are extracted, rather than larger units which contain them, or smaller units which they contain, will depend on the details of the mechanism by which chunks are created.

References

Anderson, J. R. (1990). The adaptive character of thought. Hillsdale, NJ: Lawrence Erlbaum Associates.
Anderson, J. R. (1991). Is human cognition adaptive? Behavioral and Brain Sciences, 14, 471–517.
Chater, N., & Oaksford, M. (Eds.) (2008). The probabilistic mind. Oxford, England: Oxford University Press.
Denrell, J., & Le Mens, G. (2011). Seeking positive experiences can produce illusory correlations. Cognition, 119, 313–324.
Gigerenzer, G., & Todd, P. (2012). Ecological rationality. Oxford, England: Oxford University Press.
Hahn, U., & Warren, P. A. (2009). Perceptions of randomness: Why three heads are better than four. Psychological Review, 116, 454–461.
Knill, D., & Richards, W. (Eds.) (1996). Perception as Bayesian inference. Cambridge, England: Cambridge University Press.
Lord, C., Ross, L., & Lepper, M. (1979). Biased assimilation and attitude polarization: The effects of prior theories on subsequently considered evidence. Journal of Personality and Social Psychology, 37(11), 2098–2109.
Moutoussis, M., Bentall, R. P., El-Deredy, W., & Dayan, P. (2011). Bayesian modelling of jumping-to-conclusions bias in delusional patients. Cognitive Neuropsychiatry, 16, 422–447.
Nickerson, R. S. (1998). Confirmation bias: A ubiquitous phenomenon in many guises. Review of General Psychology, 2, 175–220.
Oaksford, M., & Chater, N. (Eds.) (1998). Rational models of cognition. Oxford, England: Oxford University Press.
Peterson, C. R., & DuCharme, W. M. (1967). A primacy effect in subjective probability revision. Journal of Experimental Psychology, 73, 61–65.
Samuelson, P. (1938). A note on the pure theory of consumers’ behaviour. Economica, 5, 61–71.
Savage, L. J. (1954). The foundations of statistics. New York: Wiley.
Shafir, E. (1993). Choosing versus rejecting: Why some options are both better and worse than others. Memory and Cognition, 21, 546–556.
Singh, S., Lewis, R. L., Barto, A. G., & Sorg, J. (2010). Intrinsically motivated reinforcement learning: An evolutionary perspective. IEEE Transactions on Autonomous Mental Development, 2, 70–82.
Tversky, A. (1969). Intransitivity of preferences. Psychological Review, 76, 31–48.
Usher, M., & McClelland, J. L. (2001). On the time course of perceptual choice: The leaky competing accumulator model. Psychological Review, 108, 550–592.
Williams, G. C. (1966). Adaptation and natural selection. Princeton, NJ: Princeton University Press.
