Innateness and Contemporary Theories of Cognition

First published Mon Oct 1, 2012; substantive revision Wed Sep 13, 2017

Nativism and Empiricism are rival approaches to questions about the origins of knowledge. Roughly speaking, Nativists hold that important elements of our understanding of the world are innate, that they are part of our initial condition, and thus do not have to be learned from experience. Empiricists deny this, claiming that all knowledge is based in experience. Different Nativist and Empiricist views spell out the details in different ways, depending on which elements of our knowledge are at issue, what counts as understanding, what is meant by the initial condition, how learning is to be understood, what it is for knowledge to be based in experience, and so on. There continues to be lively philosophical debate about whether there is any satisfactory general account of what it is for something to be innate (for a review of some recent work see Gross & Rey 2012). The Nativist views discussed here differ in many respects, but all share the broad commitments of the approach. It should be noted that the commonplace opposition of Empiricism to Rationalism harks back to 17th and 18th century philosophical debates in which Nativism was a central plank in the Rationalist position. The contemporary Nativist views we consider here are independent of most of the broader Rationalist commitments (see the entry on rationalism vs. empiricism), but we note some important and often-ignored connections in section 3.3 of this entry. Although it is misleading, it is not uncommon for the terms ‘Nativism’ and ‘Rationalism’ to be used interchangeably (see the entry on the historical controversies surrounding innateness).

Up until the 1950s, there were no active research programs that were looking for the innate factors in knowledge and cognition that had been hypothesized and argued for by Nativist thinkers since Plato. It was widely agreed that the centuries-old battles between Empiricists and Nativists were over, and that the Empiricists had decisively won. The Nativist situation was actually worse than that: innateness claims were seen as not only wrong, but as ultimately unscientific approaches to mind and perhaps incoherent as well. The prevailing research agenda for scientists and philosophers interested in how the mind works was to show how our knowledge and abilities could be fully accounted for on the basis of our sensory experiences and the general learning mechanisms that operate on them.

But a number of developments have led to a resurgence of Nativism, beginning with Chomsky’s revolutionary work in linguistics in the 1950s and 1960s. This entry places this resurgence in its scientific and philosophical context, and discusses a few important areas of research to give a taste of the kinds of experimental approaches, hypotheses, and theories that have been advanced. A word about the focus of this entry is in order. Most philosophical discussions about innateness begin with careful analyses of the variety of meanings innateness claims can have, consider the sorts of entities that might be at issue in such claims (beliefs, ideas, concepts, knowledge, etc.), discuss the epistemological standing of these innate elements, and so on. These questions are no doubt interesting—and sometimes the answers are interesting too—and such work has its place. But the real action for philosophers is more in the details of the current empirical research, and less in the philosophical bookkeeping. Cognitive scientists are beginning to reveal some of the basic, or one might say ‘primal’, patterns of human cognition. They are using experimental evidence to paint a detailed picture of how we human beings understand the world—both the physical world around us, and ourselves and other selves that are parts of that world. Developmental scientists are trying to figure out to what extent and in what ways we are built by nature to arrive at these understandings. Those we identify as Nativists accord a significant role to our natures, and lean towards the view that we are not built to be initially neutral about the world we encounter, in the way that classical Empiricism would lead us to expect. This growing body of scientific thinking is of general interest, as evidenced by the attention of science magazines and newspapers like The New York Times.
But the character of our primal understandings and their innate bases are intimately connected to the central concepts and questions that philosophers have always been most interested in. Getting clear on how we naturally think and how we come to think that way is, arguably, a critical element in our understanding of human beings.

The entry has three main sections. In the first section, current Nativist developments are put in recent historical context, especially the connections between Chomsky’s linguistic innovations and current cognitive science research. The second, and longest, section takes up three areas of current research on children’s early concepts and understanding—of physical objects, number, and mind/agency—to give a sense of the type of empirical work being conducted and to highlight some of the promising results that are emerging.[1] A third section reviews some recent work in the study of development that is close to the Empiricist side of the traditional divide.

1. The Chomskyan Revolution in Linguistics

1.1 The Nativist Turn

1.1.1 Behaviorism and Nativism

The reigning experimental paradigms in mid-20th century American psychology were for the most part variants of Behaviorism. B.F. Skinner’s behaviorist account of language acquisition and use (Skinner 1957) in many ways marks the end of this dominance—or at least the beginning of the end—because it was the target of a very influential attack by Chomsky (1959). This attack convinced many of the inherent limits of behaviorist theorizing (see Cowie 2010 for details).

The defining feature of Behaviorism is its anti-mentalism—the methodological claim that one can (must) provide a psychological account of human beings without referencing internal mental states. Chomsky’s attack on Skinner zeroes in on this anti-mentalism.[2] The connection between Behaviorism and Nativism, on the other hand, is typically given less prominence. Although Behaviorism is closely tied to Empiricist associationism and is therefore ‘officially’ anti-Nativist, theories like Skinner’s do incorporate significant Nativist elements. Specifically, Skinner took it for granted that every animal has a range of naturally emitted behaviors. Some of these behaviors are responses to stimuli (Skinner’s respondents—e.g., the baby’s suckling response), while others are just emitted (Skinner’s operants—e.g., the baby’s babbling). These behaviors are the raw materials that can be shaped by experience—Skinner’s conditioning and the law of effect. So the notions of an innate behavioral repertoire, and of innately specified links between environmental stimuli and elements of that repertoire, are very much part of the Behaviorist picture. This innate repertoire was, as any good Darwinian would expect, highly information rich, because it was shaped by the history of problem solving by the animal’s forebears. All parties take it for granted that babies babble, and suckle in the presence of the right stimuli, because such behaviors are part of their biological heritage. There might be disagreements about the underlying mechanisms and epistemological standing of that heritage, but it is hard to deny that humans are in some sense pre-informed that they need to suck to get milk from the breast. This is a ‘factory setting’ for babies.
So if we set aside the controversy over the subject matter of psychology (behavior or the internal mind?), and the controversy over the right explanatory constructs (schedules of reinforcement or cognitive processes?), we find that Behaviorism is actually committed to innateness claims, and doctrinally opposed to any kind of ‘blank-slate-ism’. But this isn’t how things actually played out. Behaviorism was for the most part truer to its affiliations with philosophical Empiricism and Associationism, and its Nativist commitments were obscured. One important lesson is that in the Nativism-Empiricism debate we are often dealing with ideology, not theory (Pinker 2002).

1.1.2 The Chomskyan paradigm

The impact of Chomsky on linguistics and cognitive science has been much discussed. Here we briefly review some of the elements critical to the resurgence of Nativism.[3] Chomsky focused attention on two facts about human languages: (1) that they are very complex, and (2) that children come to master them without much systematic training. The second fact is fairly obvious, but the first is not. A very important step, as far as Nativism is concerned, was Chomsky’s notion of a generative grammar as a framework for articulating the complexity of a language. A generative grammar of a particular language is a system of rules that generates all (and only) the sentences of that language, along with a characterization of how each sentence sounds and what it means. Chomskyan linguistics is the project of discovering the elements and structure of such rule systems.
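The idea of a rule system that generates all (and only) the sentences of a language can be made concrete with a toy example. The following Python sketch is purely illustrative: the five rewrite rules, the tiny vocabulary, and the function names are assumptions introduced here, not anything proposed by Chomsky or by this entry. It covers only which word strings are generated, not how they sound or what they mean, and unlike the grammar of any natural language it is non-recursive and so generates only finitely many sentences.

```python
from itertools import product

# Hypothetical toy grammar (not from the entry): each nonterminal maps to a
# list of alternative expansions; any symbol that is not a key is a word.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["the", "N"]],
    "VP": [["V"], ["V", "NP"]],
    "N": [["child"], ["dog"]],
    "V": [["sees"], ["sleeps"]],
}


def expand(symbol):
    """Return every word sequence the rules derive from `symbol`."""
    if symbol not in RULES:  # terminal: a word stands for itself
        return [[symbol]]
    results = []
    for alternative in RULES[symbol]:
        # Expand each part, then combine the parts in every possible way.
        parts = [expand(s) for s in alternative]
        for combo in product(*parts):
            results.append([word for part in combo for word in part])
    return results


def sentences():
    """The full (finite) language generated from the start symbol S."""
    return {" ".join(words) for words in expand("S")}


def is_grammatical(sentence):
    """A string belongs to the language only if the rules generate it."""
    return sentence in sentences()
```

Under these assumed rules, "the child sees the dog" is generated and "dog the sees" is not. The toy grammar also generates "the child sleeps the dog", a reminder that even a handful of rules makes definite predictions that can be tested against speakers' judgments and then revised.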

The link between linguistics and innateness comes in a second important move: the psychologization of grammars. Chomsky argued that every speaker of a language has a mental representation of its grammar. This sets up a natural question—how did the grammar get into the speaker’s head?—and two traditional answers immediately present themselves. The Empiricist would aim to show that the grammar (if it indeed is in the head) could be learned from experience in much the way one learns other facts about the world. The Nativist, in contrast, is ready to consider that learning a language—now reconceived as a matter of grammar acquisition—depends in some way on a language-specific innate endowment. This brings us to the third important step. Chomsky argued that a comparison of (i) the grammar that has to be acquired, and (ii) the idiosyncrasies of the acquisition process and the data presented to the language-learner, favors the Nativist approach.

So Chomsky did more than simply point to language learning as an area in which the Nativist case might be built.[4] His framework for specifying the grammatical rules that the child has to master sharpened the debate between Empiricism and Nativism in something like the way that the mathematicization of physics in the 17th century revolutionized the empirical sciences.

Part of this sharpening is the result of Chomsky’s important methodological distinction between competence and performance. Chomsky argued that a scientific approach to language needed to focus on the specific mental representations that underlie linguistic behavior (‘linguistic competence’), and not on the behavior itself (‘linguistic performance’). Linguistic performance, he argued, is scientifically intractable, because it is the result of too many idiosyncratic interacting factors. We would do better to take on the much more circumscribed question: what is the system of rules (the grammar) that generates all the allowable sentences? It soon became clear that even if we set aside the performance systems involved in real linguistic behavior, the rules of the grammar were themselves very complicated, often unintuitive, and abstract, in that they involved categories and constructs that were at a significant remove from the data. The idea that children could simply ‘pick up’ these rules by attending to what is associated with what in their language environment was just not plausible (but we will see in section 3 that this claim continues to be challenged). Yet every normal child does in fact learn a language, and so does somehow master these rules. So either the general learning system that the child wields is somehow more powerful than the Associationist-Empiricist had assumed, or the Nativist is right and there is some innate language-specific information that ‘greases the wheels’ of language acquisition. To resist the Nativist conclusion, the Empiricist has to return to the drawing board to develop a more powerful general learning theory. Chomsky developed the Nativist position and termed the innate information ‘universal grammar’ or ‘linguistic theory’. This is the essence of Chomsky’s famous Poverty of the Stimulus argument, which in an important way provided a measure of the challenge that Empiricism faces. 
The Empiricist-Nativist debate was no longer a ‘you-say-experience-I-say-innate’ affair; it looked to many to be a matter of ‘put up or shut up’, and the burden was on the Empiricist to do the putting up.

There was significant controversy about all the elements of this paradigm shift: philosophical tangles about the notion of representation (in what sense is the grammar ‘in the head’?), technical linguistic debates about the structure and character of grammars for specific languages and about the nature of universal grammar, controversies in psychology about the relevance of Chomskyan formalisms to experimental studies of child learners and adult speakers of a language, and on and on. But the shift held. Linguistics went from a backwater to a central player (as a model and as an integrator) in the development of cognitive science as a multi-disciplinary approach to aspects of cognition and mind. Developmental psycholinguistics, a field more or less born out of these upheavals, set out to investigate experimentally whether the details about language acquisition actually supported the Chomskyan Nativist hypotheses, and in time, many developmental psychologists broke from the reigning Empiricist paradigm and began to deploy Poverty of the Stimulus arguments in other areas of cognitive development.

1.1.3 Nativism as natural science

Before Chomsky, Nativism suffered from two disabilities. The older charge, which we alluded to briefly at the start, was that the doctrine was in some way incompatible with a naturalistic or scientific approach to the world. It is true that the Nativist view, as defended by many early modern Rationalists including Descartes (1996/1641 and 1911/1647) and Leibniz (1981/1764), did contain (what we now regard as) a supernaturalist element: what was innate was presumed to have been placed in us by God. But besides this taint of anti-naturalism, there seemed to be another problem, highlighted by Locke: simplicity. Locke (1979/1690) argued that, all things being equal, we ought to prefer the simpler Empiricist doctrine, which posits only sense experience and general associationist learning, to the Nativist view, which adds inborn materials. It is this presumption in favor of Empiricism that was inherited by modern versions of Associationist psychology; it was taken for granted that if there were equally good Empiricist and Nativist accounts, the Empiricist account would be methodologically preferable on the grounds of simplicity.

In light of all this, it is important to recognize that Chomsky’s advances undercut both these supposed shortcomings of Nativism. On the first point, Chomsky repeatedly stressed that claims about internalized grammars and universal grammar were unexceptional empirical hypotheses about the internal causes of the observational evidence. The question of what is built in and what needs to be learned is a straightforward scientific question. It goes without saying that there is no hint of the supernatural in Chomsky’s linguistics: we have the innate structures we do because we are evolved biological organisms.

This Nativist connection to evolution raises a natural question: why did the resurgence of Nativism have to wait for Chomskyan linguistics; why didn’t the theory of evolution, developed more than a half-century earlier, undermine Empiricism and resurrect Nativism? The Empiricist paradigm, after all, has always promoted itself in terms of its very austere view of human knowers: we perceive the world, and learn all we know on the basis of our perceptual experience of it. But as we noted earlier, the Darwinian Revolution made it plain that as a general rule, evolutionary forces shape organisms to fit into their niche. Such shaping, at least in the animal kingdom, was obviously a matter of pre-organizing the animal’s behavior-producing machinery—the processing that goes on in its brain—so that, for example, birds know that they should eat worms and build nests out of twigs and not vice versa. No one tries to explain the bird’s competences (and birds’ natural competences extend far beyond this trivial example) purely in terms of the bird’s perceptual experience. Birds are not blank slates at birth. But we humans grow from the same evolutionary branches as the animals around us. This line of thought leaves us with a few possibilities. One is that all the innate preparedness painstakingly established in our evolutionary ancestors was somehow discarded, and we humans were redesigned—from scratch, as it were—as blank slates with a uniquely powerful learning capability to make up for our meager initial holdings. This is, arguably, the traditional Empiricist approach. Another is that we inherited a good deal of what evolution had established in the cognitive systems of the organisms from which we evolved, but that our further advance was, to a first approximation, based not on innate factors but on learning. 
A third view—the Nativist position—is that more was added in the course of our own evolution, and that we too are in some way pre-informed about at least some matters most critical for our survival. These possibilities are too vague to be taken as hypotheses, but the Nativist view seems at least as initially plausible as the Empiricist approach. The important point is that it should have been that plausible a century ago. Somehow the Nativist implications of evolutionary theorizing were also obscured.[5] Empiricists might argue that these implications are not relevant to the Nativist tradition that they oppose, but the point is that the issue was hardly raised. One suspects that a deep cultural and intellectual bias was at work.[6]

The upshot of this last point is that the presumed advantage of simplicity that Empiricism claimed for itself was illusory. Once we include in our measurement of simplicity how well a hypothesis fits with other established theories, the simpler hypothesis is that human beings are part of the natural biological order, and that like all other organisms they are to some degree pre-shaped by evolution to fit into their distinctive ecological niche. The naturalistic view of human beings ushered in by Darwin should have, all by itself, revived Nativism.[7] We might go a step further and ask whether Empiricism itself missed a golden opportunity to deploy evolutionary theory as a vindication of Empiricism. A more enterprising Empiricism might have noted that evolutionary theory commits us to the idea that whatever is innate in us was, at least in one sense, shaped by experience. Experience here would be ancestral experience, not the experience of the individual subject, but such a view would still ground knowledge in experience. In other words, the range of ‘learning from experience’, the Empiricist’s core commitment, would simply be extended to cover not only individual learning but species-based learning as well. But this opportunity was for the most part missed.

1.2 Exporting the Revolution: Problems and Prospects

1.2.1 The problem of linguistic exceptionalism

Although Chomskyan linguistics set the stage for a general Nativist revival, it took a while for this train to leave the station, and it will help to understand why. Part of the problem was that the original case for linguistic Nativism had been made, at least in part, by focusing on what looked to be unique features of language. Language has long been seen as exceptional, as the distinguishing feature of human cognition. Chomsky championed this view, and argued that language is central to a special kind of human creativity (Chomsky 1966).

We have already noted one facet of this exceptionalism: the fact that grammars are very complex. But there are also unexpected singularities in how children learn, that is, in the learning process itself. Each child is exposed to an idiosyncratic sample of the language (their primary linguistic data, or pld). Each sample is compatible with any number of non-equivalent grammars that all generate the pld sample so far, but give different verdicts about new cases not in the pld. We might therefore expect (i) that the grammar a child acquires reflects the idiosyncrasies of the pld the child was exposed to, (ii) that, as a consequence, children will disagree about what is and what is not grammatical, and (iii) that adults will therefore have to correct them to smooth out errors that reflect those idiosyncrasies. But this, Chomsky argued, is not what we find (Chomsky 1965). Children learning a language somehow converge on the same grammar, as evidenced by their agreement about well-formedness, and by the distinctive types of errors they make and don’t make in the course of learning. If this is right, it suggests that the child must have prior information that somehow constrains or orders the hypothesis space and so steers the child to the right grammar, and it is hard to see how this information can be acquired through experience. Furthermore, the pld contain ungrammatical and incomplete sentences, but children somehow filter out this noise, and do so without explicit instructions or feedback. There are a number of other striking features about language learning that Chomsky drew attention to: (1) it is acquired rapidly, (2) the speed of acquisition does not correlate with intelligence, (3) it does not require reinforcement or extensive explicit training, and (4) it is acquired in a critical period—a relatively fixed window in the maturation process—during which other less complex systems (counting, for instance—see below section 2.3) cannot be mastered.
Each of these claims has prompted a long trail of experimentation and theory construction, and all remain controversial (see, for example, the discussion in Menn et al. 2003). But their overall effect was to single out language learning as exceptional, and perhaps unique. Chomsky himself marked this difference by speaking of language acquisition and contrasting it with learning, a term he reserved for induction-based processes.
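The claim that a finite sample is compatible with non-equivalent grammars that diverge on new cases can be illustrated with a deliberately artificial sketch. Everything in it is an assumption made for illustration: the three-string "pld", the two candidate grammars over the letters a and b, and the function names. Nothing here is drawn from actual child-directed speech or from Chomsky's own examples.

```python
import re

# Hypothetical primary linguistic data: a small finite sample of strings.
PLD = ["ab", "aabb", "aaabbb"]


def grammar_matched(s):
    """Hypothesis 1: some a's followed by exactly as many b's."""
    n = len(s) // 2
    return n > 0 and s == "a" * n + "b" * n


def grammar_loose(s):
    """Hypothesis 2: one or more a's followed by one or more b's."""
    return re.fullmatch(r"a+b+", s) is not None


def consistent(grammar, sample):
    """True if the grammar accepts every string in the sample."""
    return all(grammar(s) for s in sample)


def disagreements(g1, g2, candidates):
    """Strings on which the two grammars give different verdicts."""
    return [s for s in candidates if g1(s) != g2(s)]
```

Both hypotheses accept every string in the assumed sample, yet they disagree about unseen strings such as "aab" and "abb". A learner who had only the sample to go on would have no basis for choosing between them, which is the gap that prior constraints on the hypothesis space are supposed to fill.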

So on Chomsky’s view, language is doubly exceptional. It is the distinctive human cognitive trait, and is essentially different from all known animal communication systems. The fact that we have it makes us exceptional as a species.[8] It is also exceptional in that the pattern of its acquisition suggests that it stands apart from all that we learn about the world; it simply grows in us. Taken together, these considerations supported a Nativist account of language learning, but tended to discourage the idea of exporting the Nativist revolution beyond language. After all, how much of the rest of the child’s untutored knowledge of the world is as complex as grammars reveal human languages to be? And how much of that knowledge comes to the child as effortlessly and without explicit instruction?

1.2.2 The expanding prospect(s)

In time, the arguments for linguistic exceptionalism gave way to a broader view of the Nativist project. Chomsky (1975) set out a fully general schema for Poverty of the Stimulus arguments that did not depend on the distinctive features of grammars and language acquisition, which had been featured in making the original Nativist case. Chomsky began to speak of language as one of possibly many mental organs that grow in the individual. This naturalistic biological model embeds Nativism about mental organs into a wider and uncontroversial biological Nativism. It is uncontroversial that kidneys do not develop as a response to the environment, and they certainly do not copy the environment. The human body is organized in such a way that in normal (fetal) environments, kidneys will form. This point could now be deployed against the Empiricist. To presume that the basic features of our physical-biological nature are internally pre-determined, but that our mental-psychological nature is instead wholly externally determined, is to introduce a dualism that requires a special defense. But Empiricism seems to make just this presumption, and offers no credible defense. So the tables are turned. The Nativist has been freed from the earlier supernaturalism charge, the simplicity-card of Empiricist models turns out to be spurious, and now the Empiricist seems to be the one carrying an unmotivated dualism as excess baggage.[9]

The mental organs approach has proven to be extremely influential in both philosophy and the cognitive sciences. In its most general form, it has displaced the idea of information in the mind as (for the most part) a single uniform set of sentences or data points, and put in its place an alternative architecture of systems and subsystems of knowledge and information, each, possibly, having its own design, pattern of representations, specialized function, pattern of activation, level of integration with other systems, (sometimes) specific locus in the brain, and so on. We mention here a number of developments significant to the Nativist side that have grown out of this central theme.

The modularity of mind hypothesis. Fodor (1983) proposed a view of our overall cognitive architecture that rested on a rough distinction between input systems, or relatively rigid computational “modules” that are designed to pick up specific types of information, and more flexible central processors that integrate that information in various ways. Each of these modules has a specific task-orientation, and does its work independently of much of what is going on in the rest of the system. So, for instance, we more-or-less automatically hear sound patterns as sentences of our native language, perceive patterns of light and shadow as configurations of objects in space, and so on. In these terms, the language organ is just one of a set of freestanding mental modules. Fodor suggested a checklist of properties that such modules could be expected to have, and among them is that they are innately determined.[10] The architectural claim about modular organization does not in itself imply an innate basis, but the hypothesis that the sorts of response patterns to linguistic and visual input (like those just mentioned) have a strong innate basis is plausible and has been experimentally pursued. Fodor’s version of the view is now termed a moderate modularity thesis, because he holds that much of the business of cognition involves ‘central’ processing that is decidedly non-modular. Modules do the work of ‘presenting the world’ to highly integrated non-modular global psychological processes. But others, like Carruthers (2006), have argued that with some adjustment to Fodor’s original characterization of modules, we can argue for massive modularity.

Evolutionary Psychology. One of the controversial arguments used to defend massive modularity claims is that evolution favors this sort of architecture. This brings us to the central doctrine of Evolutionary Psychology—i.e., that cognition is best understood as a ‘Swiss army knife’ of special purpose psychological-computational mechanisms that evolved to enhance the survival of our ancestors.[11] One much-discussed example of such a mechanism is a ‘cheater detection’ module. Our ancestors needed to distinguish fair-traders from freeloaders. Those who could be consistently taken advantage of in exchanges were at a significant disadvantage in terms of survival. At some point, a mechanism evolved—a computational program in the brain, a mental organ (or mini-organ?)—that made such vigilance and record-keeping second nature, and we now all have this module as part of our innate endowment. It’s been argued—but the claim continues to be controversial—that the operation of this module explains the (purported) fact that although we fall prey to a class of reasoning mistakes, we do not make as many of these errors when our reasoning is related to cheater-detection. For Evolutionary Psychologists, the mind is a collection of evolved sub-systems adapted to the environments of our Pleistocene ancestors, not to our own environment.[12] Evolutionary Psychology is arguably the most radical Nativist-inspired paradigm, because it looks to make the range of the Empiricist’s general purpose learning mechanism smaller and smaller.

To keep the players straight, we must note that Chomsky himself has had a very complicated relationship with evolutionary explanations of mind and cognition.[13] He is certainly not a friend of Evolutionary Psychology, and has joined with its critics in questioning its adaptationist perspective.[14]

Cognitive Ethology. The modularist position, and the Nativism that fits it so well, have been supported by recent work on animal cognition, especially the discovery of very sophisticated information-rich sub-systems in the animal brain (see Andrews 2010 for a philosophy-oriented review). Early discoveries about complex animal behavior—like Von Frisch’s work on the dance of the bees (Frisch 1971)—remained in the shadows during the heyday of Behaviorism, but more and more such systems have come to light since then. Just to take navigation as an example, desert ants have an innate dead reckoning module for navigation, and various bird species have intricate innately based systems that rely on the fixed stars, magnetic fields, the azimuth angle of the Sun, and so on.[15] All these cognitive modules/mechanisms are innately specified subsystems, and add plausibility to the Nativist theme that nature has built human beings in the same way.
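For readers unfamiliar with the term, dead reckoning (path integration) means keeping a running sum of one's own movements so that a direct route home can be computed at any point, without landmarks. The Python sketch below shows only the arithmetic involved. The function name, the representation of each leg as a (heading in degrees, distance) pair, and the counterclockwise angle convention are assumptions made for illustration; the sketch makes no claim about how the computation is actually implemented in the ant's nervous system.

```python
import math


def home_vector(legs):
    """Sum the outbound legs and return the straight-line heading
    (degrees, 0-360) and distance back to the starting point.

    `legs` is a list of (heading_in_degrees, distance) pairs, with
    headings measured counterclockwise from the positive x-axis.
    """
    # Accumulate the net displacement from the start.
    x = sum(d * math.cos(math.radians(h)) for h, d in legs)
    y = sum(d * math.sin(math.radians(h)) for h, d in legs)
    # The way home is the reverse of the net displacement.
    distance = math.hypot(x, y)
    heading = math.degrees(math.atan2(-y, -x)) % 360
    return heading, distance
```

For example, after an assumed outbound trip of 3 units along the x-axis followed by 4 units at a right angle to it, the function reports that home lies 5 units away, however winding the outbound path was.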

We have explained the ways in which Chomsky’s work in linguistics inspired subsequent Nativist thinking in the cognitive sciences. But there is an irony here in that, except for the very general Poverty of the Stimulus schema (which can be traced back to Plato), linguistics and language acquisition have not served as easy-to-use templates or paradigms for developing Nativist hypotheses in other domains. We so far have no reason to think that there is any domain outside language that requires anything as complex as a grammar of a natural language to represent it. So linguistic competence remains an exceptional element in our cognitive make-up.[16] And even though some of the distinctive features of language acquisition have counterparts in other domains—sensitive and critical periods in the development of visual perception, for example—there does seem to be something exceptional about the way virtually every normal child comes to master a language. We might say that for Nativists, language has been more an inspiration than a working model. But at the same time, as Nativists move beyond language, they may avoid many of the methodological challenges to the Chomskyan approach (including: is a grammar a theory of competence, in what sense are grammars ‘mentally represented’, is the pld all that’s relevant to acquisition, etc.).

2. Empirical Findings and Theories

A full account—even a comprehensive survey—of Nativism in the cognitive sciences is beyond the scope of this entry. But there are a number of conceptual domains that have been especially well investigated by cognitive scientists in the last decades, and this section will highlight a few areas that are the subject of lively and theoretically interesting work, and that are connected to traditional and contemporary philosophical concerns.

2.1 Background: the Piagetian Paradigm and the Core Cognition Hypothesis

The research we will discuss in this section is inspired by the Chomskyan paradigm, but there is an important difference between the language case and this developmental work. Chomsky’s Linguistic Nativism used Skinner’s Behaviorism as a foil, but the Behaviorist paradigm was not the reigning scientific paradigm in the area of child development. In this field, the Swiss psychologist Jean Piaget was the dominant figure, and his research has served as the backdrop for most developmental work over the last 40 or 50 years.[17]

Piaget generally ignored Behaviorism, and conducted experimental studies on the child’s evolving conception of the world. His extensive research agenda included the child’s understanding of space, time, God, objects, causality, morality, dreams, number, being alive, and more. Piaget’s specific questions and experimental results—which were reciprocally (mostly) ignored by Behaviorists—have served as a jumping off point for many Nativist-oriented theorists. But Piaget was not a Nativist. The heart of the Piagetian paradigm is his stage theory. On this view, children start with a very different conception of the world than adults have—in fact, Piaget thought that they start without a conception of an external world at all—and they go through a series of identifiable stages that culminate in adult understanding. The powerful unifying idea here is that there is something about the general character of these stages that is the same across all domains of understanding, and that the dynamics of stage transition is also uniform. To a first approximation, for Piagetians there are no significant distinctions between the developmental patterns in different domains of understanding. If we consider any domain, the stage theory imposes a uniform grid of steps in the development of that domain knowledge. The dynamic picture, again very roughly, is that a child at a stage proceeds until she faces an insurmountable obstacle; her present grasp of things makes it impossible for her to deal with a recalcitrant problem. This disequilibrium propels her to the next (pre-plotted) stage, in which new internal resources become available—an enriched conception of the world or a new flexibility in physical interaction—and the earlier problem can be resolved. The child recovers equilibrium until coping with problems again causes a crisis that leads to the availability of more new resources, and so on. 
The articulation of the Piagetian paradigm involved understanding the general nature of these stage-transitions better, exploring how the stage theory operates in specific domains, and understanding the new cognitive and behavioral resources that make these transitions possible.

Philosophers will recognize the theory as in some ways analogous to the theories of scientific development proposed by Thomas Kuhn (1962/1996) and others. Two important differences are worth mentioning, because they highlight what is distinctive about Piaget’s approach. First, although science develops organically, there is, for Kuhn, no one specific resource that applies across all fields. What explains the shift from one dominant paradigm to another in economics will typically not explain the shift from the Ptolemaic to the Copernican paradigm in astronomy. But Piaget held that what makes it possible for the child to advance in her understanding of space is in one sense the same thing as what facilitates the stage transitions in the child’s developing understanding of God or morality. Second, science depends on the contingent, uniquely fruitful innovations that overthrow older understandings and set the stage for new ones. But in children, Piaget’s developmental stages are posited as mandatory; we might say they are innately prescribed steps in normal development. The child’s forward motion is regularized as the world presents its predictable problems, and the new resources become available to solve them and advance the child’s understanding. The upshot is that although Piagetians produced probing and highly detailed studies of various domains of the child’s understanding, they shared the Empiricist preference for an across-the-board domain-general mechanism that could explain the developmental facts in every domain. Although there are interesting ideas inherent in the Piagetian paradigm about the innate endowment that makes adult cognition possible, it is not easy to place Piagetian Constructivism on the Nativist-Empiricist spectrum.[18]

Piaget’s theories provided the scientific received view against which developmentalists inspired by Chomsky’s linguistics reacted. These researchers set aside Piaget’s assumption that development is uniform across domains, and instead—in part inspired by Chomsky’s organology and modularity claims—considered each domain independently. The overall strategy was to discover the cognitive capacities of the youngest children, and to develop and test hypotheses about (i) the initial state, and (ii) the transitions that move the child from the initial state to the normal adult repertoire.

The ‘Core Cognition’ hypothesis. Many developmentalists in this camp share a commitment to the ‘Core Cognition’ (sometimes called ‘Core Knowledge’) hypothesis (Carey 2009; Carey & Spelke 1996; Spelke et al. 1992; Spelke 1998, 2000, 2003). According to this hypothesis, evolution has equipped our species (and other species too) with an innate repertoire of conceptual representation types, that is, representations that cannot be reduced to the perceptual primitives favored by the Empiricists or the sensory-motor primitives favored by Piagetians. Rather, evolution has shaped our perceptual input analyzers to detect certain types of entities in the world, and has provided us with principles—embodied in our cognitive machinery—that determine how we (at least initially) think about such entities. These different types of entities are few in number. To date, there is a consensus among proponents of this hypothesis that the innately specified core domains include physical objects, number, and minds.[19] Proponents of the Core Cognition view defend a moderate Nativism; they leave work for learning mechanisms, which, together with maturation, take the infant from limited ‘core’ conceptual systems to the broad and highly elaborated knowledge of the world that adults have. In some cases, adult knowledge extends the core; in others it ‘over-writes’ it. The conceptual machinery that embodies a core domain is often referred to as an ‘intuitive theory’—for instance, a folk physics or folk psychology (sometimes theory of mind)—to highlight the fact that each supports patterns of conceptualization of input and inference. There is intense ongoing work on core domains, and research paradigms are being extended to non-human animals and across cultures. In the sections that follow, we review select findings on three domains: physical objects, number, and intentional agents.[20] We concentrate on very early development. 
While it is often difficult to say what exactly the research reveals about the young child’s knowledge (for methodological as well as philosophical reasons), the earlier some distinctive elements of a competence are present, the less likely that it was learned solely on the basis of experience.

Methodological innovation: the ‘violation-of-expectancy’ looking time. The work we discuss depended on solving a knotty methodological problem: how to discover what is going on in the minds of preverbal infants and very young children? Though infants cannot report on what they are perceiving or thinking, one can make inferences from their reactions to objects and events. Long before they utter their first words, they suck, grasp, creep, crawl, and—most importantly—they look. Since infants, like adults and other animals, look longer at an unexpected stimulus, where they look and for how long they look can reveal a good deal about their expectations about the world. While measures of grasping, crawling, and sucking have all been successfully used to reveal some of what is going on in the baby’s mind, the measure that has been used most extensively is the violation-of-expectancy looking time (sometimes called preferential looking time). Experiments using this measure tend to have a similar structure: during an initial phase, the child is presented with display \(X\), over and over, until the child’s interest wanes and looking time drops down to some criterion (the habituation phase). In the test phase, the child is presented with two displays: \(Y\) and \(Z\). If the child reliably looks longer at \(Y\) than at \(Z\), this provides evidence that \(Z\) is as expected, but that \(Y\) is unexpected.
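The habituation-then-test logic just described can be made concrete with a small sketch. The Python below is illustrative only: the 50% habituation criterion, the three-trial window, the function names, and the looking times are all assumptions made for the example, not values taken from any of the studies cited in this entry.

```python
from statistics import mean

def habituated(looking_times, window=3, criterion=0.5):
    """Return True once mean looking time over the last `window` trials
    falls below `criterion` times the mean over the first `window` trials.
    The window size and the 50% criterion are assumed for illustration."""
    if len(looking_times) < 2 * window:
        return False
    baseline = mean(looking_times[:window])
    recent = mean(looking_times[-window:])
    return recent < criterion * baseline

def unexpected_display(times_y, times_z):
    """Compare mean looking times (in seconds) at test displays Y and Z.
    Longer looking is read as a sign that the display was unexpected.
    Returns "Y", "Z", or None when the means are equal."""
    mean_y, mean_z = mean(times_y), mean(times_z)
    if mean_y > mean_z:
        return "Y"
    if mean_z > mean_y:
        return "Z"
    return None

# Hypothetical looking times for one infant (seconds per trial).
habituation_trials = [12.0, 11.0, 10.0, 7.0, 5.0, 4.0, 3.5]
print(habituated(habituation_trials))
print(unexpected_display([9.0, 8.5, 10.0], [4.0, 5.0, 4.5]))
```

Actual studies compare looking times across groups of infants with statistical tests rather than with a simple comparison of means; the sketch captures only the inferential structure of the paradigm.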

2.2 Physical Objects

As adults, we recognize physical objects as bounded entities that persist through space and time; they ‘hold together’ as units, and their paths, when they move, are continuous. In addition, objects causally interact upon contact with each other. Do we learn these properties of objects by experience, and if so, by what sort of experience? Empiricist thinkers have argued that these properties are learned, and have proposed several different types of experience as requisite input to such learning. Helmholtz (1867/1962) suggested that moving around objects and manipulating them were necessary for building a concept of an object. Quine (1960) looked to language as the relevant source of information, and Piaget (1954) proposed that sensorimotor coordinations led to construction of the concept of a physical object. Indeed, Piaget famously argued that infants altogether lack object permanence (1977), the understanding that objects persist in time and space, until the latter half of the second year of life.

2.2.1 Folk physics

Object permanence. In the last 35 years, the baby’s representation of objects has been re-explored with striking results. A landmark study (Baillargeon et al. 1985) used the violation-of-expectancy paradigm to test the Piagetian claim that infants lack object permanence. Five-month-old infants were shown a screen that rotated 180 degrees up from the surface of a table and back again to its initial position. In the habituation phase, the babies got used to the screen motion and their looking time decreased, evidence that they no longer found the screen’s movement to be novel. In the test phase, an object was placed in the path of the screen as the screen moved downward to the table’s surface. In one outcome, the screen rotated down until it touched the object and then rotated back up to its initial position, an event that adults recognize as possible. In the other outcome, the screen continued its downward trajectory to the table, at first hiding the object and then apparently moving right through the space occupied by the object, an event that adults recognize as physically impossible. The logic here is straightforward: babies will see the second outcome as surprising only if (i) they represent the object as continuing to exist even when it can no longer be seen behind the screen, and (ii) they assume that two objects cannot occupy the same space at the same time. Only then should they look longer at what adults recognize as an impossible event. If, however, young infants lack object permanence or have no constraints about two physical objects occupying the same space, the impossible event will not constitute a violation of any expectation. The results were clear; babies looked longer at the impossible event, indicating that it violated their expectation of objects. The same finding was later demonstrated with 4-month-olds (Baillargeon 1987). 
These findings offer evidence that very young infants represent objects as persisting even when they are no longer in view, an understanding of object permanence thoroughly at odds with the claims of Piaget and Quine. One may still ask what exactly the child knows or represents (Burge 2010 is especially pertinent here), but the point is that there is something in the child’s cognitive apparatus that is sufficient to generate this expectation, and the burden of explanation is on the view that this is learned from experience. Moreover, these infants also expect that two objects will not occupy the same space at the same time.

Spatiotemporal continuity of objects. As adults, we know that objects are spatiotemporally continuous; an object that appears at point \(A\) and then at point \(B\) must have traversed a continuous path between these points. Here too the violation-of-expectancy looking time paradigm has been used to test the Empiricist claim that such knowledge requires an extended learning period. In one study (Spelke et al. 1995), 4.5-month-olds were shown a stage with 2 screens on it, with a visible gap between the screens. In the discontinuous motion condition, each screen has an object hidden behind it. First, the object behind the left screen is moved further left so that the baby sees it, then it is moved back behind that same screen. The object behind the right screen is shown in the same way, so that during these displays only one object has been visible at a time and no object has ever been shown to cross the gap between the two screens. Adults seeing this display infer that there are 2 objects involved. To find out if babies make the same inference, the screens are removed and the infant is shown either one object or two objects. The result is that infants look longer at the one-object display, presumably expecting, like adults, that there had to be two objects; otherwise, the object would have been visible crossing the gap. In a follow-up study, a continuous motion condition was used. This condition is identical to the previous condition except that, between the alternating trials, an object is seen crossing the gap. In this condition babies looked longer at the display of two objects. Like adults, they presumably assumed there was a single object moving back and forth (Aguiar & Baillargeon 1999 report similar findings with 2-month-olds). Generally, by 2 months, and perhaps earlier, infants expect that objects persist through time, move continuously, have parts that cohere, and are solid (Spelke 1990). Recent work (Rips & Hespos 2015) shows that by 6 months, babies already have different expectations for rigid bodies, soft objects, and liquids.

2.2.2 Animals and the representation of objects

If the representation of spatiotemporally continuous objects is part of our evolutionary endowment, we might expect to find such representation in the newborn of other species, and indeed we do. Newborn chicks, for example, display a striking ability to represent spatiotemporally continuous objects (see Spelke 1998 for review). In one study, newborn chicks spent their first day of life in a homogeneous environment containing only one inanimate object. On their second day, the object was moved fully out of view behind one of two screens. Though they had never before seen an object hidden behind another, they reliably searched behind the correct screen where the object was hidden. Indeed they even did so when they had to turn away from the object in order to reach it (Regolin et al. 1995; Regolin et al. 2000). Chicks, it seems, have object permanence from birth. Although this does not show that object permanence is innate in humans, it does show that in at least one animal, evolution has succeeded in building it in. So Nativists can claim an existence proof of an innately endowed representation of objects as permanent.

2.2.3 Babies’ representations of objects support addition and subtraction

Wynn (1992) showed that young babies represent objects as not only persisting in time and space, but also as subject to addition and subtraction. In that study, babies were habituated to a display of a single object on a stage. Then a screen came up and hid the object completely. Now a hand was seen to bring in another (identical) object and move behind the screen, from which the hand then withdrew empty. The question was: do the babies now represent 2 objects behind the screen? Test displays consisted of 1 object, 2 objects, and 3 objects. Again, in line with Nativist claims, babies showed longer looking times to all displays except the 2-object display. In this respect, babies showed the same expectations that adults do. Further testing showed that babies are not only capable of ‘adding up’ the number of hidden objects (at least to 3), but are also capable of ‘subtraction’ over the same small number of hidden objects. This finding has been replicated in 4- and 5-month-olds (Simon et al. 1995; Koechlin et al. 1998).

It may be tempting to see the infant’s ability to add and subtract the number of objects in a display as evidence that infants already have something close to the adult concept of number, but a series of studies suggests that this is not the case. Most telling is the extremely limited set size (about 3 objects) over which the baby can add or subtract. To illustrate this set size limit, which has emerged in a variety of experiments, consider the following study that used crawling, rather than violation-of-expectancy, as an indicator of the baby’s representation (Feigenson & Carey 2005). In this study, babies watched as graham crackers were placed, one at a time, into 2 separate boxes. The babies were then allowed to crawl to the box of their choice and retrieve the crackers. When one box had 1 cracker and the other box had 2, babies crawled to the box with 2. Similarly, when one box had 3 crackers and the other had 2 or had 1, they crawled to the box with 3. The surprising finding, however, is that babies failed with 4 versus 3, 4 versus 2, and even 4 versus 1. Apparently, the ability to represent and keep track of exactly 4 objects is beyond the baby’s capability. Given the set size limit of 3 objects, it is arguable that the baby’s competence should be understood as an ability to track 3 different objects in working memory. One might argue that the baby can succeed in adding and subtracting very small numbers of objects without having a general concept of number or any general numerical competence. We return to this issue in section 2.3.
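The pattern of successes and failures in the cracker study is what one would expect from a tracking system with a hard capacity limit. As a rough sketch (the function name, the constant, and the treatment of the limit as a sharp cutoff at 3 are assumptions made for illustration, not part of the study itself), the predictions can be written out as follows:

```python
TRACKING_LIMIT = 3  # approximate set-size limit described in the text

def predicted_choice(box_a, box_b, limit=TRACKING_LIMIT):
    """Predict which box an infant approaches under a strict set-size limit.

    Returns "a" or "b" when both quantities fall within the limit and differ,
    and "chance" when either quantity exceeds the limit or the boxes are equal.
    """
    if box_a > limit or box_b > limit or box_a == box_b:
        return "chance"
    return "a" if box_a > box_b else "b"

# The comparisons mentioned in the text, as (crackers in box a, crackers in box b).
for pair in [(1, 2), (3, 2), (3, 1), (4, 3), (4, 2), (4, 1)]:
    print(pair, predicted_choice(*pair))
```

On this toy model, every comparison involving 4 crackers comes out as "chance", including 4 versus 1, which is the feature of the data that a ratio-based system would not predict.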

2.2.4 Intermodal representation

As mentioned above, Piaget (1954) proposed that sensorimotor coordinations gradually lead to construction of the object concept. Establishing these coordinations between different modes of perceptual experience—vision and touch, for example—would take time, and Piaget proposed that it is not until the child is 18–24 months that these coordinations have been constructed. Meltzoff & Moore (1977) provide counter-evidence to this claim. This study shows that newborn infants can imitate the facial movements of an experimenter, clearly revealing the coordination of their own movements (along with the attendant feelings of their muscles) and their visual perception of the experimenter’s facial movements. Videos accompanying Ferrari et al. 2006 (videos S1 and S2) provide some evidence that newborn rhesus macaques have this ability as well. Insofar as these coordinations of different modes of perceptual experience are present at birth in humans and in monkeys, they simply could not be the products of learning.

2.3 Number

2.3.1 The analog number system

There is currently a great deal of empirical research—and philosophically sophisticated debate[21]—on the underpinnings of numerical knowledge in adults and children. There is strong evidence for the view that in addition to an exact number system that underlies formal mathematical thinking, adults also have an analog magnitude system for representing approximate number (see Dehaene 1997). For example, if we are very briefly shown 2 bowls of rice, one with 20 grains and one with 50, we can tell immediately which has more, even though we couldn’t say exactly how many grains of rice were in either. Similarly, shown a book with 70 pages and a book with 100 pages, we can see instantly which has more pages—though again without knowing the exact number of pages in either book. Controlled tests show that such judgments are independent of variables that correlate with magnitude, such as the extent of space occupied or the size of the individual stimuli. The ‘signature’ of this analog magnitude system is its ratio dependence: that is, the difficulty in comparing two analog magnitudes decreases as the ratio difference between them grows. Recent studies indicate that the smallest ratio difference needed for adults to successfully discriminate 2 different analog magnitudes is 8:7. If the ratio is smaller, error rates in comparing magnitudes spike up.

Analog magnitude representations in infants. Recent studies have shown that 6-month-olds use this same analog system to discriminate numerical arrays (McCrink & Wynn 2004b; Xu & Spelke 2000). In one study (Xu & Spelke 2000), infants were habituated to displays of either 8 dots or 16 dots. When shown novel dot displays, babies who had originally seen 16 dots dishabituated to displays of 8, but remained habituated to the new displays of 16. Similarly, babies who had been habituated to 8 dots dishabituated to novel displays of 16, but not novel displays of 8 dots. (Again, researchers controlled for the cumulative amount of space occupied by the dots, the density of the dots, and the size of the dots.) A series of studies have now shown that 6-month-old babies can use the analog magnitude system successfully so long as the magnitudes differ by a 2:1 ratio. When presented with dot displays that have a 3:2 ratio, such as 24 to 16, babies this age do not show discrimination. Note that the necessary ratio for discrimination gets smaller with age, so that 9-month-olds succeed when the magnitudes differ by 3 to 2. One might have thought that this ability to discriminate approximate quantities is somehow implemented in the visual system, but the analog number system has been shown to operate at a more abstract level (or perhaps to be implemented in a number of perceptual modalities). At any given age, the same ratio applies no matter whether the stimulus is a number of dots in a spatial array or the number of tones in an auditory sequence (Lipton & Spelke, 2003)—or even a number of events (jumps) in a visual display (Wood & Spelke, 2005).
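The ratio signature can be illustrated with a short sketch that applies the thresholds mentioned so far (2:1 for 6-month-olds, 3:2 for 9-month-olds, 8:7 for adults). The group labels, the function name, and the treatment of each threshold as a sharp cutoff are simplifying assumptions; in real data, error rates change gradually as the ratio shrinks.

```python
from fractions import Fraction

# Smallest discriminable ratios as reported in the text.
# The dictionary keys are labels invented for this sketch.
THRESHOLDS = {
    "6-month-old": Fraction(2, 1),
    "9-month-old": Fraction(3, 2),
    "adult": Fraction(8, 7),
}

def discriminates(group, n1, n2):
    """Return True if the larger-to-smaller ratio of two positive set sizes
    meets the threshold for `group`. What matters is the ratio, not the
    absolute difference between the two numbers."""
    larger, smaller = max(n1, n2), min(n1, n2)
    return Fraction(larger, smaller) >= THRESHOLDS[group]

for group in THRESHOLDS:
    print(group, discriminates(group, 8, 16), discriminates(group, 16, 24))
```

Note that 8 versus 16 and 16 versus 24 differ by the same absolute amount, yet only the first pair meets the 2:1 threshold assumed here for 6-month-olds.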

Not only do these representations support comparisons of magnitude, they have also been shown to support approximate addition and subtraction in babies as young as 9 months old. In one study, babies were presented with a set of 5 objects that moved behind a screen so they were no longer visible. Then another set of 5 objects was presented and they too moved behind the screen. When the screen was removed, babies looked longer if there were only 5 objects than if there were 10. In a parallel subtraction condition, where babies first saw 10 objects move behind the screen, and then saw 5 objects taken away, they stared longer when the screen was removed to show a display of 10 objects (McCrink & Wynn, 2004a).

The approximative analog system we have been discussing is different from the object-tracking system mentioned earlier (for instance, in the graham cracker study). The infant’s object-tracking system has a severely limited set size, and this is true of the adult’s tracking system as well. The analog magnitude system does not. It has also been found that infants’ success shows the ratio-dependence profile of the analog magnitude system. The fact that 6-month-old babies appear to use the same system of analog representation that adults do—although, again, their discriminations are less fine—strongly suggests that humans come equipped with an innate system that makes it possible for them to make relative size distinctions across modalities. Very recently this hypothesis has been given strong confirmation by a study showing that the analog magnitude system operates in newborn babies (Izard et al. 2009). In this study, newborns were familiarized with auditory sequences containing a fixed number of syllables and were then tested with visual-spatial images of the same or a different number of objects. Infants spontaneously associated stationary, visual-spatial arrays of 4–18 objects with auditory sequences (spoken syllables) on the basis of approximate number, providing evidence for abstract numerical representations at the very beginning of postnatal experience.[22]

Is the analog system species universal? If this analog numerical system is innate, it should be found in all human societies, no matter how urban or rural, educated or unschooled, whether in technologically advanced societies or remote and isolated tribal villages in other parts of the world.[23] If it is the same system evident in the youngest infants, it should not require exposure to any symbolic representations of number, such as Arabic numerals or a number lexicon. To test this hypothesis, investigators explored the analog number system in the Amazonian Munduruku people, an isolated tribe whose language has no words for numbers greater than 5. As predicted, the Munduruku compared and added large approximate numbers far beyond their naming range. Moreover, performance decreased as the ratios decreased, just as it did in a group of French control subjects (Dehaene et al. 2008).

Animal representation of approximate quantities. If this numerical system (what Dehaene has called our number sense) is part of our innate endowment, might it be evident in other primate species? Hauser and his colleagues (Hauser et al. 2003) presented cotton-top tamarins with auditory sequences of syllables of different numerosities. Like humans, monkeys orient their attention to unexpected stimuli. When they hear a sequence of syllables of an unexpected numerosity, they turn their heads toward the audio speaker from which the sounds are emanating, providing a reliable indicator of their discrimination of the novel number. The results are similar to those of the infant studies: cotton-top tamarins discriminated between sequences of syllables based on approximate numerosity alone. Moreover, discriminability depended on the ratio of the numbers, just as it does in humans. Indeed, adult tamarins showed comparable discrimination abilities to nine-month-old human babies.

There is now a sizeable literature showing the presence of analog magnitude representations in many different kinds of animals, including rats, crows, pigeons, a parrot, rhesus macaques, apes, and dolphins (see Carey 2009 for review). In short, there appears to be excellent evidence from studies of human adults, human babies, and animals, all suggesting the presence of an ancient evolutionary system of approximate number representation.

2.3.2 The analog system, the tracking system, and the concept of number

If one steps back from the theoretical heat of the Empiricist-Nativist debates, it should not be surprising that we have an innate system for discriminating sets by their approximate size, and that this system is found in other animals too. Animals typically need to take some measure, for example, of the relative size of food sources, of the relative number of predators on their left and right flanks, and so on. In some animals, these abilities may be part of an encapsulated system devoted to a specific task. The bee’s awareness of the relative size of a discovered food source—information communicated in the scout’s dance—is a popular example of this sort of ability (Frisch 1953). In other animals, the system operates more broadly, and different sorts of inputs can be measured in this way (heard sounds, perceived jumps, and so on). It is as if the brain has an ‘accumulator’: a bar graph system of some kind that maps input arrays into some neutral format and appends the elements together into a stack, and a scanner that judges relative stack size. Gelman and Gallistel and others have explored such systems extensively (starting with Gelman & Gallistel 1978).
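One way to see how an accumulator of this kind could produce ratio-dependent performance is to simulate it. The sketch below is a toy model only, loosely in the spirit of the accumulator idea and not a reconstruction of any published model: it assumes that the stored magnitude for a set of n items is noisy, that the noise is Gaussian, and that its spread grows in proportion to n. The noise parameter (0.2) and all names are hypothetical.

```python
import random

def accumulate(n, weber=0.2, rng=random):
    """Return a noisy magnitude for a set of n items.
    The noise is Gaussian with standard deviation proportional to n;
    both the Gaussian form and weber=0.2 are assumptions of this sketch."""
    return rng.gauss(n, weber * n)

def accuracy(n1, n2, trials=2000, seed=0):
    """Fraction of simulated trials on which the larger set also yields
    the larger accumulated magnitude (the 'scanner' picks the taller stack)."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    larger, smaller = max(n1, n2), min(n1, n2)
    correct = sum(
        accumulate(larger, rng=rng) > accumulate(smaller, rng=rng)
        for _ in range(trials)
    )
    return correct / trials

print(accuracy(8, 16))   # sets in a 2:1 ratio
print(accuracy(14, 16))  # sets in an 8:7 ratio
```

Because the noise scales with magnitude, comparisons become harder as the ratio between the two sets approaches 1, whatever their absolute sizes. This is the qualitative signature described in section 2.3.1.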

Equally unremarkable is the fact that crawling infants distinguish 1 cracker from 2 and 2 from 3. This discrimination is beyond the abilities of the posited analog approximative systems. But it suggests that there is another system in the child that is in a limited way sensitive to number. This system—which seems more tied to attention—is the subject of current research (see, for instance, Pylyshyn 2007). Animals need to keep track of changing elements in their immediate environment. One idea under investigation is that there is a psychological subsystem that ‘tags’ elements in a perceptual array and keeps track of them by assigning properties to the tag. Without such a system, we would lack the ability to re-identify changing elements from one moment to the next. Such a system therefore seems to be a prerequisite for any perception of a world or scene as involving things that are moving and changing. It is difficult to see how an animal that does not track in this way could learn to do so (although the ability might grow).

If current thinking about these systems is on the right track, we have two innate systems, each of which deals with number in some sense. The analog system takes a range of perceptual presentations and assigns an ordering by relative magnitude. The second system identifies and tracks (a limited number of) discrete elements in an environment. A current research question that is of particular interest to philosophers is this: what is the relation of these innate systems to the adult concept of number? Notice that the analog system does not get us to the concept of an exact number. Only ranges are detected—the system can judge that two sets are in the same range (below the ratio-threshold for discrimination), but two arrays that are “the same” in this way need not have the same number of elements. The second system is not approximative. If the subject has tracked 2 objects and 1 is added, as in Wynn’s studies, the difference is noted and the subject’s expectations change accordingly. So this object-tracking system is sensitive to the number of units in play, and in this respect is closer to the adult notion of number. But it has an extremely constricted range, and is useless when it comes to problems that extend beyond its range. The crawling infant in the study cited earlier doesn’t represent an added 4th cracker as one more than the 3 previously tracked. The infant doesn’t even track the 4th element: the system seems to (eccentrically) shut down completely when its range is exceeded. So the concept of number—the successor function and all that it brings in its wake—is not implemented in this system.

For these reasons, some have argued that aside from these well-evidenced systems, there must be a third element in the human mind—viz., an innate concept of number, which must involve the grasp of a fully-general successor function—that grounds adult mathematical competence (see Leslie, Gallistel & Gelman 2007, for example). Others, like Carey (2009), have argued that the concept of exact number is not innate, but is constructed by the kind of language-based bootstrapping sketched out by Quine (1960). The debate here is especially interesting because although both sides are Nativist—in that both accept innate ‘numerical’ systems—there are still learning elements in play in the search for an adequate psychological account of our distinctive arithmetic competence.

The simple question: “Is number innate?” turns out to be too simple. However the current debates play out, we can expect that the achievement of adult number competence is quite complex and involves significant innate and learned elements. We should be prepared to find that things are no less complicated on other Empiricist-Nativist battlegrounds.

2.4 Mind and Intentional Agency

2.4.1 The Theory of Mind experimental paradigm

In a seminal paper of 1978, Premack and Woodruff posed the question of whether chimpanzees have a ‘theory of mind’; that is, do they attribute mental states to others, and do they, like adult humans, predict and explain action on the basis of hypotheses about these states. It was a mark of Piaget’s influence that no one had as yet asked this question in regard to human infants; Piaget thought that they did not yet have a robust notion of an external world at all, let alone of a world containing minds. The chimp studies led to an explosion of research into the development of a theory of mind in human beings. In response to Premack and Woodruff’s paper, Dennett and others commented that the successful prediction of another’s action does not yet constitute evidence for a theory of mind. Consider the following: a child participant in a study is told a story about a boy named Max who has a piece of candy. Max puts it into the red cabinet and goes out to play in the yard. The child participant is asked, “When Max returns and wants to get his candy, where will he look for it?” The child might answer correctly because he or she understands that Max will think that the candy is where he left it or last saw it (i.e., the red cabinet). This involves attributing mental states to Max. But the child might also answer correctly because that’s where the candy actually is. That is, the child with no theory of mind might still answer correctly simply by reasoning that people go to get things where they are. The way to resolve this uncertainty, Dennett proposed, was the false belief task. In this task, which quickly became the litmus test of a theory of mind, the story includes a second character who enters the scene while Max is still outside in the yard. This second character finds the candy in the red cabinet and puts it into the yellow cabinet. Once again, the child participant—who has seen the transfer—is asked where Max will look for his candy when he returns to the kitchen. Only if the child is successful now, responding that Max will look in the red cabinet—even though the child knows that the candy is really in the yellow cabinet—can we legitimately attribute to the child a theory of mind. There is now a very large literature involving the false belief task and the bottom line appears to be that most young 3-year-olds incorrectly predict that Max will look in the yellow cabinet (or, in some studies, say that Max thinks it’s in the yellow cabinet) because that’s where it is, while somewhere between the ages of 3.5 and 4, children begin to succeed on the task.[24]

For two decades, success on the false belief task was considered the only really hard evidence for a claim that one had a theory of mind. Whatever social competence children showed before passing the false belief test was widely considered a precursor to having a theory of mind. More recently, however, cognitive developmentalists have argued that success on the false belief task is neither necessary nor sufficient for the attribution of a theory of mind, and that focusing nearly exclusively on it has led to an overly narrow view of the conceptual domain (Bloom & German 2000). The last several years have seen a plethora of studies investigating the attribution of mental states, and social cognition more broadly, in infants. The next section focuses on a group of key concepts involved in understanding minds including goals, agency, and rationality.

2.4.2 Goals

Woodward’s 1998 study on goal understanding in 6-month-olds is a good example of the pattern of recent work in this area. Infants watched a hand move across a stage and repeatedly grasp one of the two objects on opposite sides of the stage. The hand always moved along the same path to the same side of the stage and then always grasped the same object. After the infants habituated to this display, Woodward switched the location of the two objects. Now one of two events occurred: either the hand took a different path to grasp the same object it had always grasped (that object now being on the other side of the stage) or it took the same path as before, but now grasped the other object. Looking time showed that infants were more surprised when the hand followed the same path and grasped the other object than when it followed a new path and grasped the originally grasped object. This would make sense if the infants understood in some sense that the previously grasped object was the hand’s preferred goal. To see if this was really the basis of the babies’ looking responses, control conditions were included to rule out a variety of other possibilities.

In a control condition, the hand was replaced with a rod that had a multi-fingered sponge at the end. When the rod/sponge followed its old path and touched the new object, babies did not dishabituate; they dishabituated only when the rod/sponge followed a new path to the old object. The suggestion is that the babies did not see the action of the rod/sponge (whose shape was similar to the shape of the hand) as a goal-directed action. What is it about the presence of the human arm that signals a goal? Would any movement involving repeated contact between a human hand and one of the toys trigger goal attribution? Woodward (1999) shows that this is not the case. In this study, a human arm was used again, but this time the arm merely dropped onto the display, and contact was between the back of the hand and the toy. In this case, there was contact, but not grasping. In this condition, adults would be less likely to interpret the action as purposeful, and the same was true of the babies. When the hand/arm followed its earlier path (touching the new object), babies did not dishabituate; they did however dishabituate when it followed the new path, even though it made contact with the same object as before. This suggests that 6-month-old babies, like adults, attribute goal-directedness (again: ‘in some sense’) to human arms and hands that reach and grasp, but not to arms that only drop and make passive contact with the object.

What clues do babies use to determine if a perceived motion is goal-directed? The previous study suggests that they are finely tuned to complex patterns of self-directed bodily activity. One might hypothesize that babies first restrict their attributions of goals to humans only and then, with experience, extend the range to include non-humans as well (Woodward 2005; Meltzoff 2005). A recent study, however, suggests that this may not be so. In this study (Luo & Baillargeon 2005), babies reliably attributed goals to a moving box, which they were previously shown could move on its own. The key difference between the rod/sponge in Woodward’s study and the moving box in this study appears to be information about autonomous motion. The rod/sponge never showed such capacity; the moving box did. Autonomous motion, the authors argue, signals an object’s status as an agent, and agents, for the baby, have goals. These results have recently been extended to 3-month-olds (Luo 2011).

A very recent study has shown that infants are sensitive not only to clues indicating an agent’s capacity for autonomous motion, but to the perceptual information available to the agent as well and to the agent’s preferences. Remarkably, this is true even when this information differs from their own. In Luo and Johnson (2009), 6-month-old babies saw another person look at 2 different objects and repeatedly reach for the same one. As indicated by their looking times, babies in this condition attributed to the other person a preference for the chosen object. In contrast, in a condition where the baby saw 2 objects, but also saw that the other person could see only one, no preference was attributed. In this case, it seems, the baby appreciates that the other person cannot see the second object and that therefore the repeated grasping of the first object does not indicate a preference. This suggests that babies at this age can already attribute different perceptual information to different perceivers (what I see vs. what she sees). Nativists expect to find similar sorts of perceptual preparedness for other systems of knowledge and action (for instance, a system of face recognition as preparedness for social and family life).

The cognitive resources we bring to bear on the problem of responding to and carrying out goal-directed behavior are complicated; these studies provide evidence that some of these resources are in place very early in life. They do not show that the infant’s goal-directedness abilities are innate; they might somehow be learned on the basis of early experience. But again, such findings shift the burden. The earlier that resources involving notions like intention, goal, preference, and so on appear, the greater the challenge to Empiricist claims that the categories are learned solely on the basis of prior experience.

2.4.3 Agency, cooperation, and the beginnings of moral cognition

Another set of studies (Kuhlmeier et al. 2003; Hamlin et al. 2007) provides evidence that infants are not only sensitive to displays of agency, but also have a sense of (something like) cooperative behavior: they readily distinguish between helpers and hinderers. In the 2007 study, babies were shown animated displays that adults interpret as a red circle trying to climb a hill but having trouble making it all the way up (Hamlin et al. 2007 video display online). In half the trials, the babies see a yellow triangle gently ‘helping’ the circle up the hill; in the other half, they see a blue square gently pushing the circle down to the bottom. Adults plainly see the yellow triangle as a helper, an agent whose goal is to assist the circle in getting up the hill; they see the blue square as a hinderer, an agent whose goal is to stop the circle from getting up the hill. Babies make such a distinction as well. Six-month-olds showed surprise in test trials that came after the hindering and helping scenarios, in which the red circle is seen approaching its hinderer rather than its helper. Furthermore, in a live action version of the task, the babies themselves chose to touch the helper more than the hinderer when they were given both to choose.[25]

Follow-up studies suggest that infants’ understanding of social interactions goes deeper still. Hamlin 2015 reviews a series of findings that show that their preference for helpers is surprisingly nuanced. First, they are not simple helper-lovers. By the time they are 5 months old, they prefer those who hinder hinderers to those who help them. (As Hamlin 2015 points out, there is very little, if anything, in the babies’ experience that supports such a pro-hindering attitude.) Second, their preferences are not just personal; they rise to the second order. They prefer other actors who help helpers and hinder hinderers to those who do the opposite. Third, by the time they are 10 months old, they are able to factor in information about actor \(A\)’s preferences, and about actor \(B\)’s knowledge of \(A\)’s preferences, in determining whether \(B\)’s action should be classified as a case of helping or hindering (or as neutral). So, for example, if the baby sees \(A\) express a preference for toy\(_1\) over toy\(_2\) (by repeatedly choosing toy\(_1\) when given the option), and the baby also sees that \(B\) sees \(A\)’s pattern of choosing, then the baby will count \(B\)’s giving \(A\) toy\(_1\) as a helping behavior, but not \(B\)’s giving \(A\) toy\(_2\). Fourth, and perhaps most surprising, by 8 months, babies do not judge actors on the basis of outcomes, but rather on the basis of intentions. An unsuccessful helper is preferred over a hinderer (and over neutral bystanders) at the same rate as a successful helper; the same holds for unsuccessful hinderers, mutatis mutandis.

It would seem, then, that sometime between 3 and 9 months, babies are arguably already on their way to a concept of desert. Much remains to be discovered about the contours of their concept and its subsequent development. As Hamlin points out, apart from the helper-hinderer contrast, we don’t know how they determine who deserves what. Relevant here is the finding that babies prefer not only helpers, but also those who are relevantly similar, who like the same toy or candy, for example (Mahajan and Wynn 2012). How much of their apportionment of desert is dependent on such factors as opposed to factors that adults might consider morally relevant, like fairness, responding to need, and so on? How, if at all, does their early concept connect to an egalitarian notion of fairness?

If we consider morality as a system that evolved to enhance cooperation within large groups of unrelated individuals (Bloom 2013, Joyce 2006), we see that infants have some of the key prerequisites in place for such a system: (1) they have a positive attitude towards cooperation, (2) they have some grasp of other actors’ preferences and their informational point of view, (3) they are sensitive to actors’ intentions as embodied in their actions, and (4) they are ready to enforce rules by punishing violators and rewarding adherers, and they approve others who do likewise.

We are moving towards a better understanding of the early cognitive and motivational underpinnings of moral norms, understood as social rules or expectations that all are expected to obey and enforce. As we noted, there is still much to learn here about both the early state and what is likely to be a very complicated developmental story about maturation and the influence of the child’s social environment. (Bloom and Wynn 2016 provides a useful and philosophically informed summary of the state of research on which features of our moral cognition are, or are not, part of an early core.) But there is mounting evidence that from early in their first year infants are social cognizers with (at least) a hold on the moral realm. It is hard to see any way that all of this can be learned from experience (Hamlin 2015).

2.4.4 Rationality

Our understanding of goal-directed behavior is characterized by a principle of rationality; that is, that all things being equal, agents take the easiest, most direct, and most efficient means available to achieve their goal. In a series of studies, Csibra, Gergely and their colleagues provide evidence that infants use this principle (Csibra et al. 1999 and 2003). In Csibra et al. 2003, 12-month-old babies were habituated to a ball rolling along a path, apparently jumping while its path is hidden by a screen, and then continuing rolling along its path once it has emerged from behind the screen. In the test trials, the screen was removed and babies were shown one of two displays: one with an obstacle on the path, one with no obstacle. Longer looking times at the display with no obstacle indicate that jumping for no apparent reason is unexpected for the infant. In contrast, when there is an obstacle on the path, jumping over it is a direct and efficient means to achieving one’s goal and is therefore not a violation of expectation.

Another study by Gergely and his colleagues (2002) followed up on a finding of Meltzoff (1988) showing that 14-month-olds imitate the means an agent employs to attain a goal, even if those means are not the most direct or efficient. Meltzoff showed infants that tapping a panel light with his head made it light up. When babies returned to the lab the following week, they too used their heads to turn on the light, rather than simply pressing it with their hands. Gergely suggested that this seeming violation of rationality was not in fact irrational. He suggested that the baby might reason that if the light could be turned on with one’s hand, the adult they were imitating would have used his hand. The fact that the adult used his head to turn on the light suggests to the child that this must be a necessary means to achieve the goal. To test this hypothesis, the researchers added a condition in which the adult actor could not use his hands because they were otherwise engaged: the actor pretended to be very cold and used his hands to hold a blanket wrapped around him. With hands thus busy, the adult actor used his head to tap the panel light. They then compared the babies’ responses to the panel light in this hands-busy condition with the responses in the original Meltzoff condition where the actor’s hands were simply resting on the table. In the original Meltzoff condition, babies used their heads to turn on the light, but in the actor’s-hands-busy condition, the babies did not imitate the actor but instead used their hands. This supports the view that these babies already are acting on the basis of some principle connecting efficiency and goal-directedness, and that this principle is stronger than their tendency to imitate.

2.4.5 Belief and theory of mind

Let us return to the False Belief Task. It was noted earlier that children younger than 3-1/2 do not succeed in the classic paradigm. But in a recent study, Onishi and Baillargeon (2005) showed that infants as young as 13- to 15-months could succeed on a false belief task. In this study, babies were familiarized to a display of an adult placing a toy (a plastic watermelon slice) into one of two boxes and then reaching into the box as if to grasp it. The point of these familiarization trials was to indicate to the baby that reaching the toy was the adult’s goal. The toy was then moved from the box in which the adult had placed it to the other box. Although the baby always saw the toy move, and thus understood its new location, the adult did not always see the toy move; half the time, the adult’s view was blocked. The question is this: on the trials where the adult did not see the toy move to the new box—that is, when the adult had a false belief about the toy’s location—where do babies expect the adult to look for the toy? Looking time measures indicated that babies were surprised when the adult looked in the new box, even though babies knew it was the correct location. In contrast, on the trials in which adults saw the toy move to the new box, babies were surprised if adults did not look in the new location. At present there is no satisfactory account of why 3-year-olds fail the standard false belief task, given that 15-month-old babies seem to be able to attribute false beliefs to others. What else does the 3-year-old need, beyond what the 15-month-old already has, to succeed on the classic task? There are many candidate answers, but the Onishi and Baillargeon results have considerably changed the debate.

2.4.6 Animals and theory of mind

As noted above, questions about the development of a theory of mind were first posed with respect to chimpanzees, and it is to chimpanzees (and other nonhuman primates) that we now return. Until recently, most researchers agreed that there was little evidence to support the claim that nonhuman primates represented agency, goals, attention or the like (Povinelli 2000; Tomasello & Call 1997). However, chimpanzees, macaques, and other primates do follow eye gaze. Researchers have probed whether they appreciate the relationship between the direction of gaze and attention, or between seeing something and acquiring information. A number of recent studies have shown that chimps prefer to steal food from a person (or, in some conditions, a more dominant chimp) who cannot see them as opposed to a person (or more dominant chimp) who can (see Flombaum & Santos 2005; Hare et al. 2000; and Carey 2009 for review). If dedicated mechanisms to identify agents and to support our reasoning about them are part of our evolutionary heritage, as seems increasingly plausible, it should not surprise us to find them in some of our distant relatives—and in the very young.

Once again, the studies of newborn chicks are particularly illuminating. Regolin and colleagues (2000) habituated newborn chicks to a video display involving 2 balls, one red and one blue. At first the balls are presented as static. The red ball then moves, bumps into the blue ball, and then the blue ball moves. After habituation the chicks were presented with a fuzzy oval-shaped red ball and a fuzzy oval-shaped blue ball. The chicks imprinted to the red ball, not the blue one. It seems that they are sensitive to agency—that they see the red ball as an agent, while the blue ball may be a passive object. To make sure it was the red ball’s autonomous movement that was critical, experimenters partly occluded the red ball as it began its movement so that it wasn’t clear whether the movement was autonomous or set in motion by someone or something else. In this condition, the imprinting preference for the red ball disappeared. These chicks were newly hatched, so an explanation for these data that appeals to learning from sensory experience is unavailable. Once again, the chick studies provide an existence proof of an innately specified detection mechanism closely related to agency. Note that the question of what precisely the chick is detecting or representing is still open: is it autonomous motion, or agency, or some other property?

3. The Resurgence of Empiricism

The studies summarized in section 2 are representative of the Nativist resurgence. Not surprisingly, cognitive scientists with Empiricist sympathies continue to push back: to search for countervailing evidence, to question the methodologies involved in these studies, to develop alternative interpretations of the data, and so on. Moreover, as we mentioned at the outset, it is not only Nativism that has experienced a resurgence; there are important research directions in the cognitive sciences that seem inherently more friendly to the Empiricist position. In this section we briefly describe and contextualize some of these developments.

3.1 Connectionism and Dynamic Systems Theory

One important trend has been the development of Connectionism as an alternative to the ‘Classical’ conception of the mind (Newell & Simon 1976; see Garson 2010 for an overview). On the Classical view, the cognitive mind is best understood on the model of a digital computer that (i) uses symbolic representations that have a combinatorial syntax and semantics, and (ii) manipulates these representations following structure-sensitive processing rules. Connectionists replace the Classical view with a model of psychological processes as involving networks of simple units with weighted connections among the units that control the spread of activation through the network, and ‘learning’ algorithms for resetting the weights of the connections on the basis of earlier behavior of the network in response to some task. There is continuing debate about whether the Classical and Connectionist models are really incompatible, and some have argued that Connectionist systems are best viewed as implementations of classical symbol-based systems (see Pinker & Prince 1988 for discussion). But the research on psychological processing within the Connectionist framework is very different from what one finds in the Classical tradition.

Connectionism is relevant to the Nativism-Empiricism debate in two related ways. In the first place, Connectionism provides a natural format for the Empiricist idea that perception provides the basic elements of the mental system (ideas/network-nodes) and that experienced regularities among ideas strengthen their connections (associations/weightings) and in this way account for learning. But a more important idea is that if Connectionism could be established as a real alternative to the Classical symbol manipulation approach (and not simply as providing implementations of Classical systems), it could help undercut a key argument of Chomsky-style Nativism. Here is a simplified version of the target argument.[26] Chomskyans, as we noted (section 1.1.2), argued that grammars—and by extension, the rules governing other domains of knowledge—are ‘psychologically real’. If they are, and the Classical view is correct, then it would seem that such rules are present in the mind as symbolic constructions. But these rules, as linguistic grammars make plain, involve abstract concepts that are not perceptually available in the data. So if the rules are symbolically represented, then these abstract concepts, which are the constituent elements of the rules, are also internally represented. But if the relevant concepts are not perceptually available, how could they be learned by Empiricist-style mechanisms that only track regularities in the stream of experience? This sort of Nativist argument was developed in Fodor 1981. Connectionism rejects the view of mental representation on which this argument depends. For the Connectionist, information is not in the mind as the semantics of mental symbols, that is, as the meanings of terms in the language of thought. For the Connectionist, information is distributed as a pattern of weightings in a network in which none of the nodes represents anything. So: if this sort of anti-Classical Connectionist approach is successful, this particular version of the Poverty of the Stimulus argument for Nativism is blocked.

There is continuing controversy about whether Connectionism has in-principle limitations that disqualify it as a general model of cognitive processing (see Fodor & Pylyshyn 1988 and the literature this critique spawned, which is reviewed in Garson 2010). But there is a practical problem that is less controversial, and to understand it, we need to consider more closely how Connectionist nets learn. Imagine that one wants the net to learn the difference between (photos of) male and female faces. A set of input nodes will code the photo, activation will pass through a set of intermediate nodes, and an answer will appear on the output nodes. If the output on a particular input is incorrect (a male is misidentified as a female for example), the algorithm that governs the dynamics of the network automatically adjusts the weights of the connections between the various intermediate-layer nodes ‘in the right direction’ and more inputs are cycled through the system. When the output is correct over some range of inputs, the net has been ‘trained up’; it has successfully learned to tell male from female faces in photos. The art in Connectionist modeling is to discover the best network structure and the right algorithm for adjusting the weightings. The problem is that such networks learn very slowly; they often need hundreds of thousands of cycles of inputs, outputs, and weight adjustments. But humans and animals learn many things very quickly, sometimes even from one instance and often from a small set of instances (Garcia et al. 1955; Markman 1989).

One way to approach this discrepancy is to see it as due to the fact that in the typical Connectionist set up, the weights between nodes are initially set to random values, and are (very) slowly reset on the basis of small adjustments. But the fact that the initial weightings provide no prior information is arguably an artifact of the modeler’s Empiricist commitment to have all the learning ‘come from experience’. There is nothing in the general structure of Connectionist models that would prevent the modeler from starting with a highly constrained set of weightings—in this case one that already holistically contains information of the general features of human faces, and perhaps information about differences between male and female faces. The upshot, then, is that although most actual Connectionist models are Empiricist-friendly in their format and in their representational commitments, they can also be implemented in a way that is congenial to Nativist ideas. The prior information that the Nativist claims is part of the initial state of the organism can be realized by setting the initial patterns of weightings between the nodes in the network in such a way that learning will happen much more quickly. So while Connectionism may avoid the very general commitment to Nativism that some have argued is built into the Classical conception, it is neutral on the question of whether learning in a particular domain is wholly based on experience or uses innate information (suitably distributed across networks).
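The contrast between an uninformative and a pre-set initial state can be made concrete with a toy sketch. It is only an illustration under loudly stated assumptions: it uses a single-layer perceptron rather than the multi-layer nets discussed above, the two-number "face codes" are invented, and the function name `train` and all parameter values are hypothetical. The uninformative start here is all-zero weights rather than random ones, so that the run is repeatable.

```python
def train(samples, weights, bias=0.0, rate=0.1, max_epochs=1000):
    """Error-driven weight adjustment; returns (weights, bias, epochs_used)."""
    weights = list(weights)
    for epoch in range(1, max_epochs + 1):
        mistakes = 0
        for features, target in samples:
            activation = sum(w * x for w, x in zip(weights, features)) + bias
            output = 1 if activation > 0 else 0
            error = target - output  # 0 when the answer is correct
            if error:
                mistakes += 1
                # Nudge each weight 'in the right direction'.
                weights = [w + rate * error * x
                           for w, x in zip(weights, features)]
                bias += rate * error
        if mistakes == 0:  # one full pass through the inputs with no errors
            return weights, bias, epoch
    return weights, bias, max_epochs


# Invented, linearly separable stand-ins for coded photos (label 1 vs. 0).
samples = [((0.9, 0.2), 1), ((0.8, 0.1), 1), ((0.7, 0.3), 1),
           ((0.2, 0.9), 0), ((0.1, 0.7), 0), ((0.3, 0.8), 0)]

# 'Blank slate': the initial weights carry no information about the task.
blank_w, blank_b, blank_epochs = train(samples, [0.0, 0.0])

# 'Nativist' start: the initial weights already encode the relevant contrast.
tuned_w, tuned_b, tuned_epochs = train(samples, [1.0, -1.0])

print("epochs from blank start:", blank_epochs)
print("epochs from tuned start:", tuned_epochs)
```

With the tuned start the very first pass through the inputs already contains no errors, whereas the blank start needs additional passes. Nothing in the learning rule itself changes between the two runs; only the initial state differs, which is the point made in the paragraph above.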

This last point applies to Dynamic Systems Theory approaches to cognition as well (Thelen & Smith 1994; Port & van Gelder 1995). Dynamicists hold that human behavior should be explained in terms of sets of differential equations that represent a subject’s trajectory in real time through a space of possible total cognitive-behavioral states. Because they, like anti-Classical Connectionists, reject the Classical paradigm’s commitment to symbol manipulation and computation, they also avoid the Nativist consequences of that view. But neither Connectionists nor Dynamicists are in principle anti-Nativist. However we model an organism’s cognitive processes—as executing a Classical Von Neumann style program, as reassigning weights to nodes in line with a Connectionist back-propagation algorithm, or as moving through a Dynamicist state space as described by a set of differential equations—the question remains: what are the built-in initial biases of the system and what role do they play in determining the steady state? One can construct a Connectionist system that is antecedently tuned to converge on a specific steady state, and as such will have a significant Nativist element (Hummel & Biederman 1992 presents such a system for shape recognition). The same seems true of Dynamic Systems models. The oft-used Dynamicist example of a pendulum is ‘innately specified’ to reach a specific steady state (its point attractor) despite wide variability in its inputs. If very young children do indeed distinguish helpers from hinderers, for example, then this capacity will need to figure in the Dynamicist model. It will be appropriate to then ask about the role that the child’s initial structure or configuration played in its coming to have this capacity.
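The pendulum example can be sketched numerically. The following is a minimal illustration, not a model drawn from the Dynamicist literature: it assumes a pendulum with friction (without damping there is no point attractor), and the function name `settle`, the damping coefficient, the step size, and the starting angles are all hypothetical choices made for the sketch.

```python
import math


def settle(theta0, omega0=0.0, damping=1.0, g_over_l=9.81,
           dt=0.001, seconds=40.0):
    """Integrate a damped pendulum and return its final angle in radians."""
    theta, omega = theta0, omega0
    for _ in range(int(seconds / dt)):
        accel = -g_over_l * math.sin(theta) - damping * omega
        omega += accel * dt  # semi-implicit Euler: update velocity first,
        theta += omega * dt  # then position, using the new velocity
    return theta


# Widely different starting angles (the 'inputs'), all released from rest.
starts = [0.3, -1.2, 2.5]
finals = [settle(theta0) for theta0 in starts]

for start, final in zip(starts, finals):
    print(f"start {start:+.1f} rad -> final {final:+.6f} rad")
```

Each run ends hanging straight down, at an angle indistinguishable from zero. The end state is fixed by the system's own structure (gravity, length, friction) rather than by where it happened to start, which is the sense in which the text calls the steady state ‘innately specified’.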

3.2 Bayesian approaches

Even at the height of Chomsky’s influence, it was clear that the strength of the Nativist position rested, to a great extent, on the weakness of the Empiricist alternative. The central argument from the Poverty of the Stimulus was that Empiricism had failed to make its case, and that the Nativist hypothesis was therefore more plausible. But it was implicit in this dialectic that if a more powerful Empiricist learning theory were developed, it could change the terms of the debate. Furthermore, Empiricists argued that there had to be a stronger general learning theory because learning theory as developed up until that time did not have the resources to account for much learning that was plainly based on experience (Harman 1967; Putnam 1967). Some would argue that these Empiricist hopes for a more powerful learning theory have been realized. Learning theory has advanced significantly, especially in the last decade, and Empiricism can now draw upon new resources; specifically, learning algorithms based on Bayes’ Theorem. The power of Bayesianism raises the possibility that the earlier Poverty of the Stimulus arguments underestimated what could be learned from experience by general learning mechanisms.

‘Bayesianism’ is a general term for a range of sophisticated statistical methods, algorithms, and tools that draw upon Bayes’ Theorem/Rule, which tells us how to revise our beliefs given new information; that is, how to choose the best of a set of alternative hypotheses given new data. The calculation requires (i) the prior probability of the data, (ii) the probability of the data given the hypothesis, and (iii) the prior probability of the hypothesis.[27]
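In symbols, writing \(H\) for a hypothesis and \(D\) for the data (the notation is ours, chosen for illustration), the three ingredients combine as follows:

\[ \mathrm{P}(H\mid D) = \frac{\mathrm{P}(D\mid H)\,\mathrm{P}(H)}{\mathrm{P}(D)} \]

When the alternatives \(H_1, \ldots, H_n\) are mutually exclusive and jointly exhaustive, the prior probability of the data is \(\mathrm{P}(D) = \sum_i \mathrm{P}(D\mid H_i)\,\mathrm{P}(H_i)\). It is the same for every hypothesis, so the ranking of hypotheses depends only on the product \(\mathrm{P}(D\mid H_i)\,\mathrm{P}(H_i)\).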

The relevance of Bayes’ Theorem to Cognitive Science. Bayesianism is in its origins a normative theory of what one ought to believe under specific epistemic circumstances, and as such it has been applied extensively in understanding theory confirmation in the sciences. It first came to the fore in the cognitive sciences as an ideal against which one could measure human irrationality. Kahneman and Tversky (1972) famously showed that ordinary reasoners typically fall short of Bayesian standards when they are asked to decide the bearing of evidence on hypotheses, in part because they misjudge the relevance of the prior probability of the hypotheses. But in recent years, Bayesian methodologies have become a unifying framework for analyzing all aspects of cognition that can be represented as inference under uncertainty. For example, Bayesian ideas have been successfully applied to the processing underlying perception—especially the visual system (Knill & Richards 1996; Rao et al. 2002). In visual perception, a pattern of light hits the eye (the proximal stimulus), and the visual system needs to determine the nature of the visual scene in the environment (the distal stimulus) that caused that pattern. The proximal stimulus is compatible with a number of different distal stimuli. So the system faces something like the under-determination problem that a scientist faces. Both must select one view about what the world is like on the basis of information that still leaves other possibilities open. It turns out that Bayesian methods have been very successful at modeling how the visual system resolves these uncertainties.

The visual system gets an image on the retina (D), and must determine what the real-world scene is like (H). The image is compatible with many different possible scenes, but the visual system is very good at overcoming this uncertainty and reliably settles on the most likely scene. In Bayesian terms, the visual system must do this calculation:

\[ \mathrm{P}(\text{Scene}\mid\text{Image}) = \frac{\mathrm{P}(\text{Image}\mid\text{Scene})\mathrm{P}(\text{Scene})}{\mathrm{P}(\text{Image})} \]

Consider this (again simplified) example, drawn from Scholl 2005. In Figure 1, the circles are ambiguous; they can be either convex bumps or concave depressions. Viewers normally see (a) as convex and (b) as concave (but if the display is turned upside down, the properties are reversed).

[Figure 1: two grey rectangles with circles; one appears convex to most people and the other concave.]

The fact that we see these as we do can be explained in Bayesian terms. To figure out the most likely scene/source of (a), the visual system must assign a probability to the hypotheses \(H_1\) (that the circle in (a) is convex) and to \(H_2\) (that it is concave). One key assumption the visual system makes is that the scene in both (a) and (b) is illuminated by a single light source coming from overhead. So if the bottom of the circle is in shadow, we tend to see it as convex; if the top, we tend to see it as concave. When we look at (a), this assumption about the light source translates into the prior probability of \(H_1\) being higher than the prior probability of \(H_2\). So the priors in this case give us an antecedent ordering of the hypothesis space (here we ignore other hypotheses that could account for the image), and the visual system settles on (a) as convex.
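The role the priors play in this example can be made concrete with a minimal sketch. The numbers below are purely illustrative assumptions (they are not taken from Scholl 2005 or from any measurement), and the names `posterior`, `priors`, and `likelihoods` are hypothetical. The sketch assumes that the image in (a) is equally likely under both hypotheses, so that the overhead-light prior alone decides between them.

```python
def posterior(priors, likelihoods):
    """Apply Bayes' Theorem over a finite set of hypotheses.

    priors[h] is P(h); likelihoods[h] is P(image | h).
    Returns P(h | image) for every hypothesis h.
    """
    # P(image) is the sum of P(image | h) * P(h) over all hypotheses.
    evidence = sum(priors[h] * likelihoods[h] for h in priors)
    return {h: priors[h] * likelihoods[h] / evidence for h in priors}


# Assumed values: the overhead-light assumption makes H1 (convex) far more
# probable a priori than H2 (concave).
priors = {"H1_convex": 0.9, "H2_concave": 0.1}

# Assumed values: the ambiguous image fits both hypotheses equally well.
likelihoods = {"H1_convex": 0.5, "H2_concave": 0.5}

post = posterior(priors, likelihoods)
best = max(post, key=post.get)  # the hypothesis the system "settles on"
```

Because the likelihoods are equal in this sketch, the posterior ordering simply reproduces the prior ordering, which is the sense in which the priors give an antecedent ordering of the hypothesis space.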

Bayesian approaches are appealing because they provide a natural way to solve the problem that troubles theories, like Connectionism, that are built on associationist lines. Associationist learning is bottom-up. It depends on keeping track of correlations in the stream of experience and slowly modulating expectations on the basis of these correlations. But as we noted earlier, humans and animals learn about the world very quickly, and on the basis of a very small number of exposures and interventions. A child hears the word ‘horse’ applied to a few instances (and probably hears stray utterances of the word too) and reliably learns the extension of the term (Markman 1989). A rat made sick by a food one time will not eat food with that smell again (Garcia et al. 1955). These ‘fast-mappings’ are a problem for Associationist models. But they are more easily accommodated in Bayesian models, which essentially quantify the role of background knowledge—the top-down contribution—in the fixation of belief. If the rat already knows, as part of its background knowledge—its ‘factory settings’, so to speak—that when it comes to foods, smell is an indicator of edibility, then single-case learning is less mysterious. The prior probability of hypotheses linking edibility to smell may be antecedently set as very high, and hypotheses linking edibility to orientation may be set as very low. So one association between sick-making food \(f\) and smell \(s\) will be enough for the rat to ‘adopt the hypothesis’ that \(f\) and \(s\) are regularly linked. In contrast, if sick-making food \(f\) is always in a particular orientation \(o\), the rat may have a hard time making the connection even if it may be sensitive to orientations in other contexts. Similarly, if the child comes to the word-learning task with the assumption that new words most likely pick out unfamiliar extensions—again, with this assumption implemented in the priors—then her job is made easier. Bayes’ Theorem gives us a way to factor in this top-down background knowledge.[28]
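The contrast between the smell case and the orientation case can be sketched as follows. All numbers are illustrative assumptions rather than estimates from Garcia et al. 1955, and the functions `update` and `after_pairings` are hypothetical names introduced only for this sketch. Each cue is treated as a two-hypothesis problem: either the cue is linked to sickness or it is not.

```python
def update(prior_link, p_obs_if_link=0.9, p_obs_if_no_link=0.1):
    """Posterior that a cue is linked to sickness after one cue-sickness pairing.

    The two likelihoods are assumed values: a pairing is taken to be much
    more probable if the cue really is linked to sickness than if it is not.
    """
    numerator = p_obs_if_link * prior_link
    evidence = numerator + p_obs_if_no_link * (1.0 - prior_link)
    return numerator / evidence


def after_pairings(prior_link, n):
    """Apply the same update n times, feeding each posterior back in as the prior."""
    belief = prior_link
    for _ in range(n):
        belief = update(belief)
    return belief


# Assumed priors: a smell-sickness link is antecedently very probable,
# an orientation-sickness link antecedently very improbable.
smell_after_one = after_pairings(0.8, 1)
orientation_after_one = after_pairings(0.001, 1)
orientation_after_five = after_pairings(0.001, 5)
```

On these assumed values a single pairing leaves the smell hypothesis strongly believed while the orientation hypothesis remains improbable, even though the evidence in the two cases is identical. Repeated pairings do raise the orientation hypothesis, so a low prior slows learning rather than making it impossible.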

The key issue in considering the bearing of Bayesianism on the Nativist-Empiricist controversy is the priors.[29] Where do they come from? If we are talking about simple, repeatable events like coin flips, the priors are a matter of well-defined relative frequencies given by probability theory. But the prior in the concave-convex case (which was chosen to highlight this point) seems to involve domain-specific facts about light and shadow, and their relation to the shape of objects. Scholl 2005 argues that the priors here are innate, and many scientists studying visual perception would agree. We don’t learn from experience that the objects in our perceptual world will typically have overhead illumination. Rather, this is one of the ‘factory settings’ of the visual system. As Kersten (2004) puts it (speaking more generally): ‘the priors are in the genes’. Ullman (1979) argues that the same may well hold for the general constraints relating the rigidity of objects to facts about motion. The view that the illumination constraint is innate is also supported by the fact that chickens reared in an abnormal illuminated-from-below environment still react as we do to stimuli (a) and (b) (Hershberger 1970). So we have evidence that this prior can be innate.

Let us assume that there are significant innate priors that operate in perceptual processing. Does this score points for the Nativist position in general? In one way it does, because it is in line with the basic Nativist theme that humans are tailored for their natural environment. But in another sense, the Empiricist might downplay the importance of this kind of Perceptual Nativism for the larger debate. Empiricists have always taken it for granted that we perceive as we do, in large part, because of our biological-psychological nature. The traditional Empiricist focus has usually been on that part of our understanding that goes beyond what we actually perceive. Its main claim is that anything that goes beyond what we perceive is constructed out of what we’ve perceived by domain-general principles. So even if (some of) the priors involved in Bayesian models of perceptual processing are innate, the more critical arena for the Nativist is domain-specific cognitive processing, to which we now turn. Nativists would expect that the best Bayesian models of cognitive processing would have to incorporate innate priors that reflect domain-specific knowledge. Empiricists would expect that domain-specific priors are themselves learnable by Bayesian methods from experience plus domain-general constraints on learning.

We do not yet know enough to settle these questions, but they are now beginning to be addressed. Most recently, a number of theorists have used Bayesian techniques to model not just low-level perceptual processing but also aspects of higher-order cognitive processes. Areas of current research include concept learning (Tenenbaum 1999), word learning (Xu 2007), and causal reasoning (Griffiths & Tenenbaum 2005; Griffiths et al. 2011), and the list is growing.[30] Contemporary research on the application of Bayesian techniques to higher-level cognition has generally ignored the battle lines of the Nativist-Empiricist debate. The real interest is in the possibility of developing statistical techniques that, as Tenenbaum et al. 2006 puts it, “integrate bottom-up and top-down influences.” (For a very useful survey of the state of the art in statistical learning, along with concrete suggestions about deploying top-down information in such models, see Lake et al. forthcoming.) We already have sophisticated statistical analyses of the bottom-up part: the perceptual phenomena. The challenge is to develop quantitative representations and analyses of the levels of top-down background knowledge that operate in particular domains. In section 2, for instance, we considered as part of the child’s background information his theory of mind. It was on the basis of this theory that the child could develop a structural analysis of a situation in terms of agents, beliefs, goals, help/hindrance, and so on. But the information contained in such a theory and the structural analyses of particular situations that this theory makes available cannot yet be integrated into Bayesian statistical analyses. The challenge for Bayesians is to develop ways to recast the top-down elements and the analyses they make available in quantitative terms. Only then will we be in a position to address whether and to what extent top-down information is learned or innate.

We can use the case of language understanding, a well-studied area and arguably a Nativist stronghold, to illustrate how these Bayesian goals might be achieved. The phenomenon is familiar: you hear a sentence \(S_1\) as having a specific meaning. The theoretical approach mirrors the vision case: \(S_1\) (as auditorily processed) is compatible with a number of competing structural representations, but your parser somehow chooses the best one: \(\mathrm{sr}(S_1)'\). The Bayesian says that the parser is able to do this because it can do a statistical analysis that integrates bottom-up and top-down information. In this case, the bottom-up element is \(S_1\). But the range of possible structured representations the parser can select from is top-down information, as is the algorithm that chooses \(\mathrm{sr}(S_1)'\) over other candidates. The problem: how to assign a prior probability to a complex structured representation like \(\mathrm{sr}(S_1)'\) (for instance, a syntactic tree)—a probability that depends on the probability assigned to the sub-elements. We know how to assign the prior probability of a series of heads for a fair coin. But the question comes up again: how do we assign a prior probability to a linguistic representation, or to a complex visual scene, or to a complicated representation of the goals, roles, and perceptual beliefs of a player in one of the Theory of Mind scenarios? The events are more complex, the representations of the events are therefore more complex, and the hypothesis space is more complex (Chater et al. 2006).
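To illustrate what it means for the prior of a structured representation to depend on the probabilities of its sub-elements, here is a minimal sketch that uses a toy probabilistic context-free grammar. This is one standard device from computational linguistics, chosen here only for illustration; it is not put forward as the method of any of the authors cited. The grammar, its rule probabilities, and the names `RULE_PROBS`, `heads_prior`, and `tree_prior` are all made-up assumptions.

```python
from math import prod

# Hypothetical toy grammar: each entry is P(expansion | left-hand symbol).
# The values are invented; for each left-hand symbol they sum to 1.
RULE_PROBS = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("she",)): 0.6,
    ("NP", ("horses",)): 0.4,
    ("VP", ("V", "NP")): 0.7,
    ("VP", ("V",)): 0.3,
    ("V", ("sees",)): 1.0,
}


def heads_prior(n):
    """Prior probability of n heads in a row for a fair coin."""
    return 0.5 ** n


def tree_prior(tree):
    """Prior of a syntactic tree as the product of its rule probabilities.

    A tree is a tuple (label, child, ...); each child is either a subtree
    or a word given as a plain string.
    """
    label, *children = tree
    # The rule used at this node is identified by the labels of its children.
    child_labels = tuple(c if isinstance(c, str) else c[0] for c in children)
    p_rule = RULE_PROBS[(label, child_labels)]
    # Multiply in the priors of all non-terminal children (words contribute 1).
    return p_rule * prod(tree_prior(c) for c in children if not isinstance(c, str))


tree = ("S", ("NP", "she"), ("VP", ("V", "sees"), ("NP", "horses")))
p_tree = tree_prior(tree)
```

The coin case and the tree case have the same multiplicative form. The difference is that the coin's sub-element probabilities are fixed by symmetry, whereas every entry in the grammar table is a substantive assumption about the hypothesis space, and it is the source of those entries that the Nativist-Empiricist question concerns.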

In the language case, the Bayesian can hope to draw on a good deal of what contemporary linguists have already achieved in understanding the structures underlying sentence comprehension, and some computational linguists are beginning to merge such analyses with probability theory (for instance, Chater & Manning 2006). But even here, the problem of finding the best structure to assign to an input is daunting. As Chater et al. 2006 puts it:

“More challenging is inferring representational structures over which parameters are optimized. One problem is that the space of possible structures is often large and discontinuous; a second is that a direct application of probabilistic methods would involve assessing each structure by integrating a prior over its parameters, which seems computationally prohibitive; a third is that structures appear to be constrained in potentially highly abstract ways.”

In the case of theory of mind, on the other hand, we don’t yet have developed theories about the relevant structures (but see the related work on causality collected in Gopnik & Schulz 2007). So it is only if Bayesians can get a handle on these representational and statistical problems that they will be able to attack our question: how is the space of such structures generated in the first place? Is there innate domain-specific information at work, or is there a Bayesian hierarchy, a two-level-up Bayesian account that explains how this one-level-up information is acquired? (Such an account would be a Bayesian learning-theoretic account that explains why the child represents linguistic input, for example, using tree structures, but integers in terms of a very different linear structure; for further discussion see Tenenbaum et al. 2011.)

So, for example, children might know that animals are arranged in a taxonomy of a specific sort, and this prior background knowledge helps them learn about animals. But how do they get this prior? It might be that they have a prior higher-order principle \(P\) that provides a probabilistic ordering on different graph structures, and that the taxonomy they use has a higher prior probability than other ways to structure the animal world (say, a ring structure). But how do they get \(P\)? Do they learn it or is it simply there innately? To tackle these questions, all sorts of objects—structured representations one finds in a grammar, graph-structures one might find in a taxonomic representation of causal or kin relations, schemas applied to scene or event analysis, etc.—will need to be formalized and assigned probabilities. So there is much to be done. There is no a priori answer about how far up the Bayesian can go, and we do well to keep an open mind about the nature of the unlearned priors. But we should also not overlook findings like the chick’s stubborn presumption of illumination from above, which suggest that nature can build in unlearned priors, and that they can be domain specific. It would be, at the very least, extremely surprising if nothing like this operates in human psychology.

Bayesianism, then, focuses the Nativist-Empiricist question on the priors. First, we need to find out where the background knowledge brought to bear in any particular task comes from. Is some part of it innate, or can its presence be accounted for in terms of higher-order Bayesian learning? At some point, the Bayesian will come up against what is not learned by Bayesian methods (at the very least, the Bayesian machinery itself[31]), and we will want to understand its specific character. Will it be information implemented in our perceptual systems or domain-general information that applies no matter what is being learned, supporting the Empiricist view, or will some of it be tailored to specific ranges and domains of knowledge, vindicating the Nativist? We are still at the beginning of the road to the answers to these questions.

It is important not to underestimate the challenge that Empiricists face. The Bayesian formalism might make it seem that all that needs to be explained is why the hypotheses compatible with the data are ordered in the way they are. So, for example, if it could be shown that the bias for a light from above explanation of the retinal image (in section 3.2) could, in principle, have been learned from experience, then it might appear that the Empiricist wins the round. It is plausible that some of the priors relevant to scene recognition will be learnable in this way, so this may be right as far as it goes. But it does not get to the heart of the challenge.

Once we shift our attention away from the Bayesian formalism and focus on the fact that we are looking for an account of the cognitive/computational machinery that processes inputs and generates complex structural analyses of the world—think here of the subsystem that generates theory of mind representations of interacting agents in terms of goals, rationality, cooperation, desert, and so on—then it becomes clearer that the burden on the Empiricist is much greater. She must not only explain where the representation types come from, she must also account for the cognitive processing over these representation types. So even if there were a satisfactory Empiricist account of how the infant sorts inputs into the categories agent, hinderer, and so on, we still need a separate account of where the cognitive machinery that operates over these types comes from. Without that, the infant will have a static typology with no way to anticipate the dynamics of the situation. Here, then, the Empiricist seems to have two choices. She must make the case that the machinery involved is either (i) generic and domain-independent, or (ii) domain-specific, but itself the product of higher-order learning.

The point is that the Empiricist must account for the repertoire of representation types that figure in the processing and for the machinery that does the processing. The ultimate outcome of this debate will depend in part on how distinctive and complex the computational machinery turns out to be. If the engine for theory of mind dynamics only needs to do tree search or production rule application (for example), then the case for Empiricism is strengthened, because these are very general computational capacities. Only the content of the specific rules or the content at the nodes will be domain-specific. But the more idiosyncratic the computational machinery in a specific subsystem—for example, a simulator that anticipated motions and locations in a dynamic physical scene—the greater the challenge for the Empiricist to explain how such an internal computational device is acquired from experience alone.

As we noted, although Bayesianism has had a special appeal for Empiricists, one can use Bayesian methodologies and remain open to Nativist possibilities. There is nothing to prevent a Bayesian from starting with an innate system of representations and computational machinery and then using Bayesian algorithms to figure out how learning from experience might work for such a system. This approach is very much in line with the Core Nativist theorizing discussed in section 2. (For a detailed defense of this methodology, see Lake et al. forthcoming.)

In summary, then, Bayesianism appeals to Empiricists for at least two important reasons. First, it reinstates learning from experience as a central process in cognitive development and change. This focus on learning contrasts sharply with the first wave of Nativist cognitive research, which, inspired by Chomsky’s work in linguistics, tended to assign a diminished role to learning from experience. Experience was thought to act as a trigger/releaser of innate information, or, as in some linguistic theorizing, as setting values to parameters that were left unspecified by our innate endowment. The lead role, again following Chomsky, was assigned to growth, understood simply as biological maturation. The second reason is that the current Bayesian mindset tends in some ways towards Empiricism. This is primarily because Bayesian learning can, at least in principle, be extended hierarchically, in the ways we’ve discussed. But Bayesianism also has some appeal to Nativists, because it focuses attention on the role of background knowledge in learning, and this is a theme that Nativists have pressed against bottom-up Associationist forms of Empiricism from the outset. Nativists can welcome a renewed focus on learning, and join in the development of Bayesian theories of cognitive development. So in the end, Bayesianism—as an approach to cognitive development—is, like Connectionism, compatible with Nativism. (For recent discussion of this last point, see Colombo 2017; for a more pessimistic assessment of the potential contribution of Bayesian approaches to psychology see Jones and Love 2011.)

Concluding remarks: Nativism and Rationalism

Nativism, as we have seen, is a vigorous program in contemporary cognitive science. But there is very little talk of Rationalism. The term is sometimes repurposed as another name for Nativism, and in some cases it is explicitly disowned, with Nativism taken to be the only plank in the original Rationalist platform worth saving. But following up on our previous discussion, there is a case to be made that this common attitude misses an important and distinctly Rationalist feature of current Core Nativist research. Here we briefly sketch the key ideas behind this claim.

The Classical Nativist-Empiricist debate is an expression of a disagreement about a bigger question: what is the (cognitive) mind and what is it for? For the Empiricist (Hume 1975/1738 being an especially clear example), the mind is first and foremost a pattern detector, and it is for prediction. For the Humean, all we have to work with is one experience and then another. Cognitive processing is at bottom experience mining: deploying our domain general cognitive machinery like statistical learning routines, memory retrieval, attentional mechanisms, associative connectivity, and so on, to detect patterns in the sequences of traces (Hume’s ideas) that experience leaves behind in memory.

For the Rationalist, mind is for understanding. Understanding is of course connected to pattern detection and prediction, but it also involves making sense of the patterns at some deeper level. The Classical Rationalist view is that Reason (with a capital ‘R’), in part embodied in our innate endowment, somehow makes this sort of deeper understanding possible. Descartes’ famous wax example (Descartes 1996/1641) is aimed at making clear the difference between detecting patterns in the flow of perceptual ideas that are prompted by a piece of wax and having a real understanding of what a piece of wax is. The Humean Empiricist rejects this search for depth as illusory; pattern detection is all understanding is or can be.

Our point is that the Core Cognition approach retains this distinctively Rationalist emphasis on understanding. Core systems like our intuitive physics and theory of mind help us construct models of the world based on innate abstract frameworks. By deploying these theories we can go beyond the input patterns and come to understand not just how things look, but also what they are, why they are as they are, why they change in the ways they do, how things might be if relevant parameters were different, and so on. When the baby sees triangles pushing a square up a hill, she constructs a rich conceptualization of the scene that breaks things up into different kinds of elements, and assigns properties to the elements: properties involving agency, physical object, goals, private intentions, information states, rationality, number, shape, etc. She can use this conceptualization—her intuitive theory—to understand the dynamics of what she sees and in that way make sense of the situation.

Despite this commonality, Core Nativists are not one with Classical Rationalists. Descartes took our innate Rational notion of the physical to be at the heart of the true physics. Newton showed that this was wrong, and that we needed to go beyond our intuitive physics if we wanted to get a deeper understanding of the world. Core Knowledge theorists reject the Classical assumption that our innate sense-making frameworks are necessarily true. But true or not, what is innate provides a framework that we use to make sense of the world. To this extent, the Core Knowledge program revitalizes this key Rationalist idea. It remains an open question how we are to understand this notion of (deep) understanding. This search goes back at least as far as Plato. But our point here has not been to defend the revival of this Rationalist theme; only to note the commonality.

In conclusion, the studies that we surveyed in section 2 provide compelling evidence that we have been underestimating how much infants and young children understand about the world. At the same time, it is clear that adult competence goes far beyond the child’s in virtually every domain. The Bayesian framework we discussed in section 3 has the potential to address both issues at once. It provides a systematic and quantifiable approach to development, and is at the same time open to incorporating innate elements. Whether it will succeed in unifying a learning-theoretic approach to cognitive development with the built-in representations favored by Nativists remains to be seen.


  • Aguiar, A., & R. Baillargeon, 1999, “2.5-month-old infants’ reasoning about when objects should and should not be occluded,” Cognitive Psychology, 39: 116–57.
  • Andrews, K., 2010, “Animal Cognition”, The Stanford Encyclopedia of Philosophy (Fall 2010 Edition), Edward N. Zalta (ed.).
  • Baillargeon, R., 1987, “Object permanence in 3½- and 4½-month-old infants,” Developmental Psychology, 23(5): 655–664.
  • Baillargeon, R., E. Spelke, & S. Wasserman, 1985, “Object permanence in 5-month-old infants.” Cognition, 20: 191–208.
  • Bechtel, G. & A. Abrahamsen, 1999, “The Life of Cognitive Science.” In Bechtel, W. & Graham, G. (eds.) A companion to cognitive science, pp. 1–104. Malden, Mass.: Blackwell.
  • Bloom, P., 2013, Just Babies: the Origins of Good and Evil. NY, Crown Publishing
  • Bloom, P. & K. Wynn, 2016, “What develops in moral development,” in Barner, D. & A Baron, eds., Core Knowledge and Conceptual Change. Oxford, OUP, 347–364.
  • Bloom, P. & T. P. German, 2000, “Two reasons to abandon the false belief task as a test of theory of mind,” Cognition. 77(1): B25–31.
  • Burge, T., 2010, Origins of Objectivity, New York, Oxford University Press.
  • Carey, S., 2009, The Origin of Concepts, New York, Oxford University Press.
  • Carey, S. & S. Gelman, 1991. The Epigenesis of mind: essays on biology and cognition, New Jersey, Lawrence Erlbaum Associates.
  • Carey, S. & E. Spelke, 1996, “Science and core knowledge,” Philosophy of Science, 63: 515–533.
  • Carruthers, P., 2006, The architecture of the mind: massive modularity and the flexibility of thought. Oxford: Clarendon Press.
  • Carruthers, P., S. Laurence, & S. Stich (eds.), 2005, The Innate Mind: Structure and Contents. New York: Oxford University Press.
  • ––– (eds.), 2007, The Innate Mind, Vol. 3: Foundations and the Future, New York: Oxford University Press.
  • Chater, N., J.B. Tenenbaum, & A. Yuille, 2006, “Probabilistic models of cognition: where next?” (Editorial), Trends In Cognitive Sciences, 10(7): 292–293.
  • Chater, N. and Manning, C.D., 2006 “Probabilistic models of language processing and acquisition,” Trends in Cognitive Sciences, 10(7): 335–344.
  • Chomsky, N., 1959. “Review of Verbal Behavior,” Language, 35: 26–58.
  • Chomsky, N., 1965. Aspects of the theory of syntax. Cambridge: M.I.T. Press.
  • Chomsky, N., 1966. Cartesian Linguistics: A Chapter in the History of Rationalist Thought. New York/London: Harper & Row.
  • Chomsky, N., 1975. Reflections on Language. New York: Pantheon.
  • Colombo, M., 2017. “Bayesian cognitive science, predictive brains, and the nativism debate”, Synthese DOI 10.1007/s11229-017-1427-7.
  • Cowie, F., 2010, “Innateness and Language”, The Stanford Encyclopedia of Philosophy (Summer 2010 Edition), Edward N. Zalta (ed.).
  • Csibra, G., S. Biro, O. Koos, & G. Gergely, 2003, “One-year-old infants use teleological representations of actions productively,” Cognitive Science, 27: 111–113.
  • Csibra, G., G. Gergely, S. Biro, O. Koos, & M. Brockbank, 1999, “Goal attribution without agency cues: The perception of ‘pure reason’ in infancy,” Cognition, 72: 237–267.
  • Darwin, C., 1998/1872, The Expression of the Emotions in Man and Animals, London: Harper Collins.
  • Dehaene, S., 1997, The number sense. Oxford University Press, Penguin Press, New York, Cambridge UK.
  • Dehaene, S., V. Izard, E. Spelke, & P. Pica, 2008, “Log or linear? Distinct intuitions of the number scale in Western and Amazonian indigene cultures,” Science, 320(5880): 1217–20.
  • Dehaene, S. & J. Mehler, 1997, “Numerical transformations in five month old human infants,” Mathematical Cognition, 3(2): 89–104.
  • Dennett, D., 1978, “Beliefs about beliefs (commentary on Premack & Woodruff),” Behavioral and Brain Sciences, 1: 568–70.
  • Descartes, R., 1911/1647, “Notes Directed Against a Certain Program”, in Haldane and Ross, eds. The Philosophical Works of Descartes, Cambridge, Cambridge University Press.
  • Descartes, R., 1996/1641, Meditations on First Philosophy, translated by John Cottingham, Cambridge: Cambridge University Press.
  • Doherty, M. J., 2009, Theory of Mind: How Children Understand Others’ Thoughts and Feelings. Hove, UK: Psychology Press.
  • Downes, S., 2010, “Evolutionary Psychology”, The Stanford Encyclopedia of Philosophy (Fall 2010 Edition), Edward N. Zalta (ed.).
  • Feigenson, L., S. Carey, 2005, “On the limits of infants’ quantification of small object arrays,” Cognition, 97(3): 295–313.
  • Ferrari, P.F., E. Visalberghi, A. Paukner, L. Fogassi, A. Ruggiero, et al., 2006, “Neonatal Imitation in Rhesus Macaques,” PLoS Biol 4(9): e302. doi:10.1371/journal.pbio.0040302.
  • Flanagan, O., 1991, Science of the Mind, 2nd edition, Cambridge, MA, MIT Press.
  • Flombaum, J. I. & L. R. Santos, 2005, “Rhesus Monkeys Attribute Perceptions to Others,” Current Biology, 15(5): 447–452.
  • Fodor, J. A., 1981, “The Present Status of the Innateness Controversy,” in Representations: philosophical essays on the foundations of cognitive science, pp. 225–253. Montgomery, Vt.: Bradford Books.
  • –––, 1983, The Modularity of Mind. Cambridge, MA: MIT Press.
  • Fodor, J. & Z. Pylyshyn, 1988, “Connectionism and Cognitive Architecture: a Critical Analysis,” Cognition, 28: 3–71.
  • Frisch, K. von, 1953, The Dancing Bees. Harcourt, Brace & World, Inc: New York, NY.
  • –––, 1971, Bees: their vision, chemical senses, and language. Ithaca: Cornell University Press.
  • Gallistel, C.R., 2000, “The Replacement of General-Purpose Learning Models with Adaptively Specialized Learning Modules,” in M.S. Gazzaniga, Ed. The Cognitive Neurosciences. 2d ed. (1179–1191) Cambridge, MA. MIT Press.
  • Garcia J, D. J. Kimeldorf, & R. A. Koelling, 1955, “Conditioned aversion to saccharin resulting from exposure to gamma radiation,” Science, 122(3160): 157–8.
  • Garson, James, 2010, “Connectionism”, The Stanford Encyclopedia of Philosophy (Winter 2010 Edition), Edward N. Zalta (ed.).
  • Gelman, R. & C. R. Gallistel, 1978, The Child’s Understanding of Number, Cambridge, MA: Harvard University Press.
  • Gergely, G., H. Bekkering, & I. Kiraly, 2002, “Rational imitation in preverbal infants,” Nature, 415(6873): 755.
  • Gopnik, A. & L. Schulz, (eds.) 2007, Causal Learning: Psychology, Philosophy, and Computation, Oxford University Press.
  • Griffiths, T.L. & J. B. Tenenbaum, 2005, “Structure and strength in causal induction,” Cognitive Psychology, 51: 334–384.
  • Griffiths, T.L. & A. Yuille, 2006, “Technical introduction: a primer on probabilistic inference,” Trends in Cognitive Sciences, 10(7). Supplement to special issue on Probabilistic Models of Cognition.
  • Griffiths, T.L., D.M. Sobel, J.B. Tenenbaum, & A. Gopnik, 2011, “Bayes and blickets: Effects of knowledge on causal induction in children and adults,” Cognitive Science, 35(8): 1407–1455.
  • Gross, S. and G. Rey, 2012, “Innateness,” in The Oxford Handbook of Philosophy of Cognitive Science, Eric Margolis, Richard Samuels, and Stephen P. Stich (eds.), Oxford: Oxford University Press.
  • Hamlin, J.K., 2015, “Does the infant possess a moral concept?”, in Margolis, E. & S. Laurence, eds., The Conceptual Mind: New Directions in the Study of Concepts, Cambridge, MA, MIT Press.
  • Hamlin, J. K., K. Wynn, & P. Bloom, 2007, “Social evaluation by preverbal infants,” Nature, 450: 557–559.
  • Hare, B., J. Call, B. Agnetta, & M. Tomasello, 2000, “Chimpanzees know what conspecifics do and do not see,” Animal Behaviour, 59: 771–786.
  • Harman, G., 1967, “Psychological aspects of the theory of syntax,” Journal of Philosophy, LXIV: 75–87.
  • Hauser, M.D., F. Tsao, P. Garcia, E.S. Spelke, 2003, “Evolutionary foundations of number: spontaneous representation of numerical magnitudes by cotton-top tamarins,” Proceedings of the Royal Society of London B, 270(1573): 1441–1446.
  • Hauser, M. D., 2006. Moral minds: how nature designed our universal sense of right and wrong. New York: Ecco.
  • Helmholtz, H. von, 1867/1962, Treatise on physiological optics (J. P. C. Southall, Trans.). New York: Dover.
  • Hershberger, W., 1970, “Attached-shadow orientation perceived as depth by chickens reared in an environment illuminated from below,” Journal of Comparative and Physiological Psychology, 73(3): 407–11.
  • Hook, S. (ed.), 1969, Language and Philosophy: A Symposium. NY: NYU Press.
  • Hume, D., 1975/1738, A Treatise of Human Nature, ed. L.A. Selby-Bigge, rev. P.H. Nidditch, Oxford: Clarendon Press.
  • Hummel, J & I. Biederman, 1992, “Dynamic binding in a neural network for shape recognition,” Psychological Review, 99(3): 480–517.
  • Izard, V., C. Sann, E. S. Spelke, and A. Streri, 2009, “Newborn infants perceive abstract numbers,” PNAS, 106(25): 10382–10385.
  • Jones, M. & B.C. Love, 2011. “Bayesian fundamentalism or enlightenment? On the explanatory status and theoretical contributions of Bayesian models of cognition (with replies)”, Behavioral and Brain Sciences, 34: 169–231. doi:10.1017/S0140525X10003134
  • Joyce, J., 2008. “Bayes’ Theorem”, The Stanford Encyclopedia of Philosophy (Fall 2008 Edition), Edward N. Zalta (ed.).
  • Joyce, R., 2006, The Evolution of Morality, Cambridge, MA, MIT Press.
  • Kahneman, D. & A. Tversky, 1972, “Subjective probability: A judgment of representativeness,” Cognitive Psychology, 3(2): 430–454.
  • Kersten, D., P. Mamassian, & A. Yuille, 2004, “Object perception as Bayesian inference,” Annual Review of Psychology, 55: 271–304.
  • Knill, D. & W. Richards (eds.), 1996, Perception as Bayesian Inference. Cambridge University Press.
  • Koechlin, E., S. Dehaene, & J. Mehler, 1998, “Numerical transformations in five-month-old human infants,” Journal of Mathematical Cognition, 3: 89–104.
  • Kuhlmeier, V.A., P. Bloom, & K. Wynn, 2004, “Do 5-month-old infants see humans as material objects?” Cognition, 94: 95–103.
  • Kuhlmeier, V.A., K. Wynn, & P. Bloom, 2003, “Attribution of dispositional states by 12-month-olds,” Psychological Science, 14: 402–408.
  • Kuhn, T., 1962/1996, The Structure of Scientific Revolutions, 3rd edition, Chicago: University Of Chicago Press.
  • Lake, B.M., Ullman, T.D., Tenenbaum, J.B., & S.J. Gershman, forthcoming. “Building Machines that learn and think like people”, Behavioral and Brain Sciences, ms.
  • Laurence, S. & E. Margolis, 2007, “Linguistic Determinism and the Innate Basis of Number,” in P. Carruthers, S. Laurence, & S. Stich, 2007, 139–169.
  • Leibniz, G.W., 1981/1764, New Essays on Human Understanding. Translated by Peter Remnant and Jonathan Bennett. Cambridge: Cambridge University Press.
  • Leslie, A. M., C. R. Gallistel, & R. Gelman, 2007, “Where integers come from,” in P. Carruthers, S. Laurence, S. Stich, 2007, 109–138.
  • Lipton, J.S. & E. Spelke, 2003, “Origins of number sense: Large number discrimination in human infants,” Psychological Science, 14(5): 396–401.
  • Locke, J. & P. H. Nidditch, 1979/1690, An essay concerning human understanding. Oxford: Clarendon Press.
  • Luo, Y., 2011, “Three-month-old infants attribute goals to a non-human agent,” Developmental Science, 14(2): 453–460.
  • Luo, Y. & R. Baillargeon, 2005, “Can a self-propelled box have a goal? Psychological reasoning in 5-month-old infants,” Psychological Science, 16: 601–608.
  • Luo, Y. & S. C. Johnson, 2009, “Recognizing the role of perception in action at 6 months,” Developmental Science, 12: 142–149.
  • Mahajan, N. & K. Wynn, 2012, “Origins of ‘us’ versus ‘them’: prelinguistic infants prefer similar others,” Cognition, 124: 227–233.
  • Markman, E., 1989, Naming and Categorization in Children. MIT Press.
  • Marler, P., 1991, “The instinct to learn” in Carey, S. & Gelman, S. eds., The Epigenesis of Mind: Essays in Biology and Cognition, pp.305–330. Lawrence Erlbaum Associates, Hillsdale, NJ.
  • McCrink, K. & K. Wynn, 2004a, “Large-number addition and subtraction by 9-month old infants,” Psychological Science, 15(11): 776–781.
  • –––, 2004b, “Ratio abstraction by 6-month-old infants,” Paper presented at the International Conference on Infant Studies, Chicago.
  • McGilvray, J. A., 2005, “Introduction” in J. A. McGilvray (ed.) The Cambridge Companion to Chomsky, pp. 1–18. Cambridge, UK: Cambridge University Press.
  • Meltzoff, A.N., 1988, “Infant imitation after a 1-week delay: Long-term memory for novel acts and multiple stimuli,” Developmental Psychology, 24(4): 470–476.
  • –––, 2005, “Imitation and other minds: The ‘like me’ hypothesis,” in S. Hurley & N. Chater (eds.), Perspectives on imitation: From neuroscience to social science (Vol.2, pp. 55–77). Cambridge, MA: MIT Press.
  • Meltzoff, A.N. & M. K. Moore, 1977, “Imitation of facial and manual gestures by human neonates,” Science, 198: 75–78.
  • Menn, L. & M. M. Vihman, et al., 2003, “Acquisition of Language”, International Encyclopedia of Linguistics. William J. Frawley. 1992, 2003 Oxford University Press.
  • Mikhail, J., 2008, “The Poverty of the Moral Stimulus,” in Moral Psychology, Vol. 1: The Evolution of Morality: Innateness and Adaptation, Walter Sinnott-Armstrong, ed., pp. 353–360, Cambridge, MA: MIT Press.
  • Nagel, T., 1997, The last word. New York: Oxford University Press.
  • Newell, A., & H. A. Simon, 1976, “Computer science as empirical inquiry—Symbols and search,” Communications of the ACM, 19(3): 113–126.
  • Onishi, K. H., & R. Baillargeon, 2005, “Do 15-month-old infants understand false beliefs?” Science, 308: 255–258.
  • Piaget, J., 1928, The Child’s Conception of the World. London: Routledge and Kegan Paul.
  • –––, 1936/1952, The Origins of Intelligence in Children. New York: International University Press.
  • –––, 1954, The construction of reality in the child. New York: Basic Books.
  • –––, 1977, H. E. Gruber & J. J. Voneche (eds.), The essential Piaget. New York.
  • Piattelli-Palmarini, M. (ed.), 1980, Language and learning: the debate between Jean Piaget and Noam Chomsky, Cambridge, MA: Harvard University Press.
  • Pinker, S., 2002, The blank slate: the modern denial of human nature. New York: Viking.
  • Pinker, S. and P. Bloom, 1990, “Natural language and natural selection,” Behavioral and Brain Sciences, 13(4):707–784.
  • Pinker, S. and A. Prince, 1988, “On Language and Connectionism: Analysis of a Parallel Distributed Processing Model of Language Acquisition,” Cognition, 28: 73–193.
  • Plantinga, A., 1993, Warrant and proper function. New York: Oxford University Press.
  • Port, R. & T. van Gelder, 1995, Mind as Motion: Explorations in the Dynamics of Cognition, MIT Press, Cambridge, MA.
  • Povinelli, D. J., 2000, Folk physics for apes. New York: Oxford University Press.
  • Premack, D. & G. Woodruff, 1978, “Does the chimpanzee have a theory of mind?” Behavioral and Brain Sciences, 1: 515–526. (Dennett’s response is included in the issue.)
  • Putnam, H., 1967, “The ‘Innateness Hypothesis’ and explanatory models in linguistics,” Synthese, 17: 12–22. Reprinted in H. Putnam, 1975, Philosophical Papers, Vol. 2, 107–16.
  • Pylyshyn, Z., 2007, “Multiple object tracking”, Scholarpedia, 2(10): 3326. [Available online].
  • Quine, W. V., 1960, Word and object. [Cambridge]: Technology Press of the Massachusetts Institute of Technology.
  • Rao, R., B. Olshausen, & M. Lewicki, (eds.), 2002, Probabilistic Models of the Brain: Perception and Neural Function. MIT Press.
  • Regolin, L., G. Vallortigara, & M. Zanforlin, 1995, “Object and spatial representations in detour problems by chicks,” Animal Behaviour, 49: 195–199.
  • Regolin, L., L. Tommasi, & G. Vallortigara, 2000, “Visual perception of biological motion in newly hatched chicks as revealed by an imprinting procedure,” Animal Cognition, 3: 53–60.
  • Rips, L.J. & Hespos, S.J., 2015, “Divisions of the physical world: concepts of objects and substances,” Psychological Bulletin, 141, 786–811.
  • Robbins, P., 2010, “Modularity of Mind,” The Stanford Encyclopedia of Philosophy (Summer 2010 Edition), Edward N. Zalta (ed.).
  • Santos, L.R. & M. D. Hauser, 1999, “How monkeys see the eyes: Cotton-top tamarins’ reaction to changes in visual attention and action,” Animal Cognition, 2: 131–139.
  • Scholl, B., 2005, “Innateness and (Bayesian) visual perception,” in Carruthers et al. 2005, pp. 34–52.
  • Simon, T., S. Hespos, & P. Rochat, 1995, “Do infants understand simple arithmetic? A replication of Wynn, 1992,” Cognitive Development, 10: 253–269.
  • Skinner, B. F., 1957, Verbal Behavior. New York: Appleton-Century-Crofts.
  • Spelke, E., 1990, “Principles of object perception,” Cognitive Science, 14(1): 29–56.
  • –––, 1998. “Nativism, Empiricism, and the origins of knowledge,” Infant Behavior & Development, 21(2): 181–200.
  • –––, 2000, “Core knowledge,” American Psychologist, 55: 1233–43.
  • –––, 2003, “What makes us smart: core knowledge and natural language,” in D. Gentner & S. Goldin-Meadow, (eds.), Language in Mind: Advances in the Study of Language in Thought, Cambridge, MA., MIT Press, 277–311.
  • Spelke, E., K. Breinlinger, J. Macomber, & K. Jacobson, 1992, “Origins of knowledge,” Psychological Review, 99: 605–632.
  • Spelke, E. S., R. Kestenbaum, D. Simons, & D. Wein, 1995, “Spatiotemporal continuity, smoothness of motion and object identity in infancy,” The British Journal of Developmental Psychology, 13: 113–142.
  • Street, S., 2006, “A Darwinian Dilemma for Realist Theories of Value,” Philosophical Studies, 127: 109–166.
  • Teglas, E., E. Vul, V. Girotto, M. Gonzalez, J. B. Tenenbaum, & L. L. Bonatti, 2011, “Pure Reasoning in 12-Month-Old Infants as Probabilistic Inference,” Science, 332(6033): 1054–1059.
  • Tenenbaum, J.B., 1999, “Bayesian modeling of human concept learning,” Proceedings of the 1998 Conference on Advances in Neural Information Processing Systems, 11: 59–65.
  • –––, (ed.), 2006, “Special Issue: Probabilistic Models Of Cognition,” Trends in the Cognitive Sciences, 10(7): 287–291.
  • Tenenbaum, J.B., T. L. Griffiths, & C. Kemp, 2006, “Theory-based Bayesian models of inductive learning and reasoning,” in Tenenbaum, ed., 2006, pp. 309–318.
  • Tenenbaum, J.B., C. Kemp, T. L. Griffiths, & N. D. Goodman, 2011, “How to grow a mind: statistics, structure, and abstraction,” Science, 331: 1279.
  • Thelen, E. & L. B. Smith, 1994, A Dynamic Systems Approach to the Development of Cognition and Action, Cambridge, MA: The MIT Press.
  • Tomasello, M., & J. Call, 1997, Primate cognition, Oxford, U.K.: Oxford University Press.
  • Ullman, S., 1979, “The interpretation of structure from motion,” Proceedings of the Royal Society of London, B, 203(1153): 405–426.
  • Wellman, H.M., D. Cross, & J. Watson, 2001. “Meta-analysis for theory-of-mind development: the truth about false belief,” Child Development, 72(3): 655–84.
  • Wood, J.N., & E. Spelke, 2005, “Infants’ enumeration of actions: Numerical discrimination and its signature limits,” Developmental Science, 8(2): 173–181.
  • Woodward, A.L., 1998, “Infants selectively encode the goal object of an actor’s reach,” Cognition, 69(1): 1–34.
  • –––, 1999. “Infants’ ability to distinguish between purposeful and non-purposeful behaviors,” Infant Behavior and Development, 22: 145–160.
  • –––, 2005, “The infant origins of intentional understanding,” in R. V. Kail (ed.), Advances in child development and behavior (Vol. 33, pp. 229–262). Amsterdam: Elsevier.
  • Wynn, K., 1992, “Addition and subtraction by human infants,” Nature, 358: 749–750.
  • Xu, F., & E. S. Spelke, 2000, “Large number discrimination in 6-month old infants,” Cognition, 74(1): B1–B11.
  • Xu, F., 2007, “Rational statistical inference and cognitive development,” in Carruthers et al. 2007, pp. 199–215.

Copyright © 2017 by
Jerry Samet
Deborah Zaitchik