Is Language Innate? - The Language Module Reconsidered

The Adaptable Mind: What Neuroplasticity and Neural Reuse Tell Us about Language and Cognition - John Zerilli 2021


Let me begin with a couple of straightforward observations. Infants begin their lives with a remarkable ability to detect and respond to subtle acoustic distinctions that vary considerably across the world’s languages. Within a short time—indeed, before 10 months of age (Kuhl & Damasio 2013)—and pursuant to a powerful and in some respects still mysterious learning process, they come to recognize statistical properties in the acoustic stream, form phonetic categories, distill words and possibly inflectional items, and assimilate the basic phrase structure of their mother tongue. By one year of age, they appear able to comprehend simple imperatives like “Show me your nose” (Glickstein 2014). And even though the linguistic environment is not so impoverished as was once believed (Clark 2009, p. 368; Pullum & Scholz 2002; Scholz & Pullum 2002), it is still striking that most, if not all, of this gets underway without explicit instruction or drilling. Overall, adopting the terminology and concepts introduced in Chapter 6, let us admit that language acquisition involves a certain degree of developmental robustness—not quite like that of the visual system or the growth of wings on birds (or the growth of limbs, or the onset of puberty, or any of the other comparisons rhetorically invoked in the past), but something to reckon with, nonetheless. The appeal of a concept like robustness is that it admits of degrees, so that to concede that a system is characterized by robustness does not commit one to implausible claims. Indeed, developmental robustness is somewhat reminiscent of the notion of being acquired under a “poverty of stimulus,” because to be so acquired just is to develop independently of the presence of some specific environmental stimulus or stimuli, and hence denotes a sort of invariance with respect to experience (Griffiths & Machery 2008, pp. 406–407).
Drawing this link is acceptable as long as the locution is employed with care, and with the understanding that acquisition under poverty of stimulus is a relative phenomenon, not an absolute one (see Chapter 6).

The question now is whether this developmental profile reopens the debate over the existence of an ELU. Does the fact that language seems to be acquired so early in life, with fair uniformity over substantial variations of intelligence and experience, without specific training, and so on, call for the postulation of a domain-specific language module? The move is understandable, but unnecessary, for several reasons. It is worth remembering that however robust language acquisition might be, even in the most ideal conditions it can take a long time to complete (up to ten years or more). There is, in any case, a more natural and parsimonious explanation for why language acquisition proceeds at the ontogenetic pace it does, and why it often seems that children attain mastery of their native language almost effortlessly. The explanation lies in the mutual accommodation (or fit) between the processing dispositions of the brain regions used in language and language itself. There is evidence both that language was culturally shaped (as a “cultural tool”) to be learnable and easy to process (Everett 2012; Christiansen & Chater 2016; Laland 2016) and that selective pressures in the course of biological evolution may have equipped the brain with the sorts of processing dispositions and biases that made language easier to learn and process (Dor & Jablonka 2010; Sterelny 2006; Christiansen & Chater 2016; see also Laland et al. 2015). Since cultural evolution is by far the more important side of this story, I shall have a little more to say about it than about biological evolution. But before I go any further, let me frame the main point of this section in terms of Stanislas Dehaene’s (2005) “neuronal recycling” hypothesis, which we met briefly in Chapter 3.

Recall that towards the end of Chapter 6, I observed that while there is a relative sense in which modules and other functionally significant brain regions can be considered innate, the same cannot be said for the higher-level cognitive functions composed of them. Cast in terms of reuse, low-level cognitive workings may be innate, but it does not follow from this that higher-level cognitive uses are innate. Most complex cognitive functions are learned over the course of a person’s life. Whether it be riding a bicycle, tying one’s shoes, or reading, such skills do not spontaneously unfurl as a result of intrinsically determined developmental processes, so it makes sense to withhold the designation “innate” or “robust” from the C-networks that implement them. Why, then, is language acquisition different from reading, performing long division, or learning physics? What is it about conversation that entitles us to regard it (and its C-network) as in some sense sharing in or inheriting the robustness of its components (workings/M-networks)? This is where Dehaene’s notion of a “neuronal niche” is useful. Cultural acquisitions must make their home among a particular ensemble of cortical regions (a C-network), and this process is akin to the process of organisms’ creating their own ecological niches among the habitats in which they find themselves. Just as organisms must make the best use of the resources at their disposal, so cognitive organisms (i.e., cultural acquisitions) are constrained by the processing dispositions of the brain regions required for the tasks at hand. We have already seen that brain regions do have robust processing capabilities and clear input preferences. Dehaene’s idea is that the more the acquired practice matches the processing dispositions of the brain regions recruited for the task, the easier and less disruptive the learning process, because the neural composite does not require a radical departure from existing cortical biases.
On the other hand, the greater the distance between the acquired practice and the processing dispositions of the brain regions it will draw upon, the more difficult and protracted the learning process will be, potentially disrupting the regions’ established operations and whatever functional composites they already subserve.

It is not hard to see how this account would dovetail nicely with a cultural evolutionary account revealing the ways in which language has been shaped over many hundreds of generations to be learnable and easy to process. If human languages have in fact been so worked upon as to make them easy to learn and use, the neuronal niche that languages must nuzzle into already ideally conforms to the sorts of processing demands that languages impose on language users. It is just as well, then, that there is just such a cultural story to tell! Brighton et al. (2005) call it “cultural selection for learnability.” If the rudiments of syntax, phonology, morphology, and so on are to survive from one generation to the next, they must earn their keep. If they are too cumbersome or exotic to be readily learned, taken up, and transmitted to the next generation, they will be discarded for simpler and more streamlined or efficient devices. There is mathematical modeling to suggest that compositionality could have evolved in this fashion, for instance (Smith & Kirby 2008; Kirby et al. 2007).

It is hard to deny that human languages are cultural products (Everett 2012). And if so, it makes perfect sense that they should be made in the image of the cognitive and neural dispositions of the agents that created them. As Christiansen and Chater (2016, pp. 43–44) explain:

In other cultural domains, this is a familiar observation. Musical patterns appear to be rooted, in part at least, in the machinery of the human auditory and motor systems . . . art is partially shaped by the properties of human visual perception . . . tools, such as scissors or spades, are built around the constraints of the human body; aspects of religious beliefs may connect, among other things, with the human propensity for folk-psychological explanation.

They identify and elaborate upon four groups of nonlinguistic constraints that they conjecture would have guided the cultural evolution of language. (Much of this can be read as a natural extension of the ideas in § 7.3 concerning the reuse of language circuits.) They divide the constraints here between those arising from thought, perceptuo-motor factors, memory, and pragmatics. For example, on the assumption that thought is “prior to, and independent of, linguistic communication,” key properties of language such as compositionality, predicate-argument structure, quantification, aspect, and modality can be “proposed to arise from the structure of the thoughts language is required to express” (2016, p. 51). Cognitive linguists have made the dependence of language on thought a critical feature of their perspective, arguing that our basic conceptual repertoire, including the concepts of space and time, has left its mark on the structure and categories of the world’s languages (Croft & Cruse 2004; Evans & Green 2006). Perceptuo-motor constraints have also left their mark, most obviously in “the seriality of vocal output,” which “forces a sequential construction of messages” (2016, p. 52). Christiansen and Chater speculate that

The noisiness and variability . . . of vocal . . . signals may, moreover, force a “digital” communication system with a small number of basic messages: e.g., one that uses discrete units (phonetic features, phonemes, or syllables). The basic phonetic inventory is transparently related to deployment of the vocal apparatus, and it is also possible that it is tuned, to some degree, to respect “natural” perceptual boundaries. (2016, p. 52)

The extent of the connections here can be taken quite far. MacNeilage (1998), for example, has offered the intriguing hypothesis that syllabic structure might have been partly determined by the jaw movements involved in mastication! While not immediately obvious, on reflection it seems likely that many complex aspects of phonology and morphology can be traced to similarly prosaic origins. Memory constraints are hardly less significant; they can be seen, for instance, in the tendency to resolve linguistic dependencies (e.g., between arguments and their verbs) as early as possible, “a tendency that might not be syntax-specific, but instead an instance of a general cognitive tendency to attempt to resolve ambiguities rapidly whether for linguistic . . . or perceptual input” (Christiansen & Chater 2016, p. 53). Finally, pragmatic constraints must have wielded a hefty influence on many aspects of language design—Levinson (2000) showed that anaphora is partly governed by pragmatic principles of discourse, so it is plausible that aspects of binding theory could be accounted for in terms of pragmatics. In all these ways, and without doubt very many more (including ways yet to be explored—a monumental undertaking, really), language has been “shaped by the brain,” naturally and parsimoniously explaining the child’s relative ease of acquisition and the intimate relationship between the child’s innate endowment and the structure of language.

Before concluding this section, I should indicate something of the process of mutual fit and accommodation as it occurs in the other direction. While language has been predominantly shaped by the brain, to be sure, in certain limited respects it is at least likely that the brain has been shaped through selection pressures for language. In the previous section, I mentioned that in order for adaptations to arise, evolution requires a stable environment, and that adaptations for language specifically would require a linguistically stable environment. I also said that linguistic and other cultural environments are in the nature of things quite unstable, and that given these contingencies, when it comes to cultural environments, plasticity is typically favored over robustness. This is just to say that unstable environments are conducive to the sorts of nervous systems that exploit the same resources for alternative ends, so that the cognitive mechanisms that are selected for in such circumstances will typically be flexible enough to be put to alternative uses (Avital & Jablonka 2000; Dor & Jablonka 2010). Now is as good a time as any to reference the well-known phenomenon of “niche construction,” part of the “Extended Synthesis” in evolutionary biology (see Laland et al. 2015 and Laland et al. 2011 for reviews). Niche construction is a specific instance of the broader process of gene–culture coevolution (Boyd et al. 2011; Richerson & Boyd 2005); its distinctive feature is that the process is cumulative. Organisms are always altering their environments to better suit their needs, whether by building nests, burrows, dams, and so on. In the case of humans, these environmental modifications extend to the social and cultural worlds that encompass language.
The changes wrought in these ways inevitably modify the selection pressures acting on organisms, facilitating adaptation to the newly created environments; organisms then alter those environments further still, generating new selection pressures, and so on, in a virtuous cycle of organism-directed environmental and cultural modification and adaptation that leaves organisms increasingly better adapted to the material, social, and cultural worlds of their own making. It is highly likely that cognitive mechanisms evolved for language in this manner (not ELUs, however: see next paragraph), particularly to the extent that we can identify universal, stable features across linguistic environments (such as a constrained range of phonemic units arranged combinatorially and with duality of patterning). Laland (2016, p. 5) conjectures that “[i]mportant elements of infant-directed speech, such as infants’ sensitivity to its linguistic features, or adults’ tendency to engage in behaviour that elicits rewarding responses from infants (e.g., smiles), have been favoured through a biological evolutionary process.” Adding to the list of adaptations that would have been crucial in the evolution of a language faculty, we could cite the ability to represent symbolically (Deacon 1997), the ability to reason about other minds (Malle 2002), the ability to engage in pragmatics (Levinson 2000), increased working memory (Gruber 2002), an increased domain-general capacity for learning words (Bloom 2000), and modifications to the human vocal tract (descended larynx, etc.) (de Boer 2016).

It is vital to stress that in respect of none of these adaptations can we say that we are dealing with an ELU—language may have provided the occasion for selection, but there is no evidence that these mechanisms are used exclusively for language, and indeed there is overwhelming evidence that the brain simply “doesn’t work that way”: virtually no cortical structure, not even the visual cortex(!), is so insensitive to experience that it resists all cooption during development. Rather, the evidence points to a brain that integrates all sorts of brain regions within the neural ecology for the management of organism–environment interactions, even where these regions might by nature be disposed to processing particular sorts of inputs over others. This makes good evolutionary sense, being overall “a more efficient use of metabolically expensive brain matter” (Anderson 2014, p. 46). Even the structure of the vocal apparatus has uses outside the language faculty (in music and meditation, for example).

One last thing: cognitive adaptations relevant to a specific domain like language may require no more than a simple change to synaptic connection patterns; for instance, a genetic event that entrenches a pattern of connections between a set of preexisting domain-general modules (Ramus & Fisher 2009, p. 865). This is in fact just what the theory of reuse entails, at least for many cases involving the emergence of novel traits: to the extent that the theory holds that it will often be easier to mix and match existing elements than to evolve them afresh each time a new evolutionary challenge arises, it implies that specific combinations of neural elements (which have perhaps proved their value developmentally) will be selected for. How else could a specific arrangement of preexisting domain-general modules be entrenched other than through a robust synaptogenetic process of some description (see § 7.5 on “search”)? Thus it could be that some parts of the language C-network, perhaps even large parts, are already wired up and ready to go, even though the modules within the network are entirely domain-general. Preformed connections would surely result in a smooth period of language learning, even given “relatively slight exposure and without specific training” (Chomsky 1975, p. 4).