The Language Module Reconsidered

7.2 Defining a Language Module

7.2.1 The meaning of linguistic specialization

There is a clear consensus in modern neuroscience that language is mediated by “defined sets of circuits” (Fisher 2015, pp. 150–151). The main debate over these circuits concerns whether they are specific to language (Chomsky 2005, 2010). In Chapter 3, I raised the possibility that, despite extensive evidence of the reuse of neural circuits and what appears to be the deeply interpenetrative nature of mental functions, some small component or set of components is rarely coopted outside the language domain. Such a component (or set thereof) would be strictly specialized for language in its being recruited predominantly, perhaps even exclusively, for linguistic purposes. By way of example, I mentioned the possibility of a neuron or restricted set of neurons’ being dedicated to conjugating the verb “to be” and having no nonlinguistic functions at all (other examples are discussed in § 7.2.3). I said that this component might aptly be described as a language “module.” The debate over the specialization of linguistic function, then, can be understood as a debate concerning the existence of such modules (Fedorenko & Thompson-Schill 2014). It is an important question in its own right, of course (cf. Fitch 2010), but it carries further implications for other inquiries into the human cognitive system, as well as for the evolution of language. Among the various alternative ways of construing the issue (as to which, see later in this chapter), this is the understanding with which I shall proceed here. Let me, however, define the problem more precisely before I turn to address it directly in the following sections of this chapter.

So far, we have seen how the evidence of neural reuse strongly suggests that the only dissociable unit we are likely to encounter in the brain will be one that resembles the neuroscientific notion of a module. The neuroscientific module is sometimes called a “brain module” or “cortical module” (Mountcastle 1978, 1997; Pascual-Leone & Hamilton 2001; Gold & Roskies 2008; Rowland & Moser 2014; Zador 2015); at other times a “cortical column” or “columnar module” (Mountcastle 1978, 1997; Buxhoeveden & Casanova 2002; Amaral & Strick 2013; Zador 2015); at still other times an “elementary processing unit” (Kandel & Hudspeth 2013), or simply an “operator” (Pascual-Leone & Hamilton 2001; Pascual-Leone et al. 2005). It corresponds (very roughly) to the node of a neural coactivation graph and is known to perform only exiguous subfunctions such as aspects of edge detection or depth discrimination—certainly nothing as high-order as language acquisition or norm acquisition per se. High-order complex functions are instead enabled by neural ensembles or composites, which are just so many arrangements of these low-level neural modules, often highly distributed across the cortex (and therefore not localized, contrary to much traditional speculation). But we also saw that, owing to the effects of the many different neural contexts in which modules appear (namely, the functional assemblies instantiating high-level complex functions), it is not clear that such units will always possess the degree of specialization required to sustain their modularity: in many cases, the label “module” may actually be a misnomer. The true extent of modularity in the cortex—even with the benefit of a neuroscientifically informed conception to hand—is very much an open question. As a way of coming to grips with this issue, in Chapter 5 I provided a scale of specificity for brain regions that makes their indicia of specialization explicit. Situating the question of the modularity of language within this framework sharpens the issue considerably and brings out useful points of contrast with alternative construals. Varying degrees of modular specialization can be represented along a continuum running from A to E, each with the indicia specified in Table 5.1 (p. 79). Brain regions at or to the left of C, which marks the onset of weak context effects, will be sufficiently specialized to count as modular. Brain regions to the right of C, characterized by strong context effects, will not. Again, plasticity increases as one moves from A through D.

The search for a language module may be construed as a search for a type B module. Let us call such a type B language module an “elementary linguistic unit,” or “ELU.” It will also be remembered that in § 5.3, I provided a notation to describe the entities in view here. A true module (any of types A through C) is a certain sort of network of neurons, which I called an “M-network” (for convenience we may regard all of the types A through E as M-networks, even though the paradigm cases encompass only A through C). A “C-network” is the composite structure that brings the several modules implicated in a complex function into coalition. Language as a gross psychological capacity is mediated by a restricted class of C-networks (e.g., a speech comprehension network, a speech production network, etc.). This much is beyond dispute. The proponent of a language “module” needs to show in addition that at least one of these language C-networks’ constituents is an ELU. Indeed, the traditional claim is more ambitious, with theorists maintaining that there is in effect a large M-network that handles core aspects of language—a super-sized ELU, as it were—such as Chomsky’s Merge or Fodor’s sentence parser (see later in this chapter) (Chomsky 1980a, pp. 39, 44, 1988, p. 159, 2002, pp. 84–86; Fodor 1983; Plaut 1995; Pinker & Jackendoff 2005, p. 207; Fitch et al. 2005, p. 182; Collins 2008, p. 155; Fedorenko & Thompson-Schill 2014). The argument of the present chapter is that there are unlikely to be any ELUs—that the only units we are likely to find among the constituents of our language C-networks are M-networks of the types C through E.
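
To fix ideas, the taxonomy just rehearsed can be rendered as a minimal data-structure sketch. Everything below (the class names, the example constituents, the contains_elu check) is my own illustrative shorthand for the book's notation, not part of its formal apparatus:

```python
from dataclasses import dataclass
from enum import Enum

class SpecType(Enum):
    """The A-E scale of specificity (Table 5.1): A is most specialized;
    weak context effects set in at C, strong context effects beyond it."""
    A = 1
    B = 2
    C = 3
    D = 4
    E = 5

@dataclass
class MNetwork:
    """A low-level network of neurons -- a candidate 'module'."""
    name: str
    spec: SpecType

@dataclass
class CNetwork:
    """A composite bringing several M-networks into coalition to realize
    a high-level capacity (e.g., speech comprehension)."""
    name: str
    constituents: list

def contains_elu(c_net: CNetwork) -> bool:
    """A language C-network contains an 'elementary linguistic unit'
    just in case at least one constituent is a type B M-network."""
    return any(m.spec == SpecType.B for m in c_net.constituents)

# Hypothetical constituents, for illustration only.
comprehension = CNetwork("speech comprehension", [
    MNetwork("auditory analysis", SpecType.D),
    MNetwork("sequence learning", SpecType.E),
])
print(contains_elu(comprehension))  # False: no ELU among these constituents
```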

Now it may seem that this construal of the matter is austere, and that I have set a most demanding test for the modularity of language. Other ways of understanding linguistic modularization have occasionally been discussed. Broadly speaking, these fall into two categories, one of which might be called a “neurological” or “structural” understanding, the other a “psychological” one (by now a familiar distinction). The neurolinguists Evelina Fedorenko and Sharon Thompson-Schill (2014) explain them using network concepts. The structural notion is the more or less conventional one I have just described, which looks for an ELU. As they put it, “a network may be functionally specialized for mental process x if all of its nodes are functionally specialized for x, [but] perhaps the presence of at least one functionally specialized node is sufficient to qualify the whole network as functionally specialized” (2014, p. 121). In this view, even a single ELU would suffice as evidence of the specialization of language.

The psychological approach, on the other hand, would count as specialized any system whose pattern of interconnections between nodes is unique to the function the system performs:

In this approach, the properties of the nodes are less important; they may be functionally specialized, domain general, or a mixture of the two. What matters is whether a unique combination of nodes and edges is recruited for the relevant mental process x. If so, such a network would be considered functionally specialized for x, even if all of the individual nodes are domain general . . . and even the same exact combination of nodes can contribute differently to different mental processes when the nodes are characterized by different patterns of connection. (Fedorenko & Thompson-Schill 2014, p. 121)

In this much more liberal view, language is specialized if the patterns of connections that characterize its C-networks are unique to those networks, notwithstanding that the same (indeed even the very same) nodes are recruited beyond the language domain, provided that the wiring patterns are distinctive in each case. Translated into psycholinguistic terms, a linguistic capacity (e.g., phoneme discrimination, lexical retrieval, etc.) would count as specialized and hence domain-specific if its representations were unique to the linguistic domain.2 For example, if representations of speech sounds or lexical items are unique to language, in that they are consumed only while an agent is engaged in linguistic tasks, they would count as domain-specific. Intuitively, it will be easier to show that a capacity is domain-specific in this psychological sense than it will be in the neurological/structural sense: for it is unlikely that the linguistic representations involved in (say) phoneme discrimination, lexical retrieval, or syntactic parsing are consumed outside the specific contexts of linguistic communication. By the same logic, chess playing and golf will employ domain-specific representations. Of course, few would deny the importance of network configurations when explaining cognitive functions, or that there are occasions when our attention is properly captured by the dynamics of distinct (and in that sense, “specialized”) networks. But it would surely surprise no one that the brain enters into a different state whenever it switches between tasks, or even that such states are reasonably consistent across individuals. Systems specialized in this sense lack the stability and permanence underwriting the sort of specialization likely to be of interest to those in search of a language module. What has predominantly mattered to these researchers is just the extent to which mental processes like language rely on structurally dedicated mechanisms and specific computations.
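
The liberal criterion lends itself to a toy formulation. In the sketch below (the node labels, edge sets, and uniqueness check are all my own inventions for illustration), two capacities share every node yet count as “specialized” simply because their wiring differs, which is why chess playing would pass the test as easily as language:

```python
# Domain-general nodes shared by both capacities.
nodes = {"n1", "n2", "n3"}

# Each capacity is individuated by its pattern of connections (edges).
language_net = {("n1", "n2"), ("n2", "n3")}
chess_net = {("n1", "n3"), ("n2", "n1")}

def psychologically_specialized(net, other_nets):
    """The liberal criterion: a capacity counts as specialized if its
    combination of nodes and edges is recruited by no other process,
    even though every node is domain general."""
    return all(net != other for other in other_nets)

print(psychologically_specialized(language_net, [chess_net]))  # True
```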

John Collins, for instance (a philosopher and noted defender of generative linguistics), conjectures that “the peculiar specificity of language deficits suggests that the realization of language is found in dedicated circuitry, as opposed to more general levels of organization” (Collins 2008, p. 155). Chomsky himself has written that “It would be surprising indeed if we were to find that the principles governing [linguistic] phenomena are operative in other cognitive systems. . . . [T]here is good reason to suppose that the functioning of the language faculty is guided by special principles specific to this domain” (Chomsky 1980a, p. 44). Barely a decade later, he wrote that “[i]t would be astonishing if we were to discover that the constituent elements of the language faculty enter crucially in other domains” (Chomsky 1988, p. 159). Many commentators (e.g., Goldberg 2003; Pinker & Jackendoff 2005) assume that Chomsky has relented in his stridency concerning this requirement, but in fact he has continued to hold out for the potential vindication of “earlier versions of generative grammar” in this regard (see, e.g., Fitch et al. 2005, p. 182, and the ambivalent remarks in Chomsky 2010, p. 53). For despite the abstractness of the Minimalist Program—which simplifies the idealization to language in the interests of evolutionary tractability—Chomsky has continued to write of a “language organ” that is “analogous to the heart or the visual system or the system of motor coordination and planning,” commenting approvingly on the view that regards specialized learning mechanisms as “organs within the brain” that are “neural circuits whose structure enables them to perform one particular kind of computation” (Chomsky 2002, pp. 84–86). Thus linguistic specialization for Chomsky relates both to structure and to function, which is presumably why he continues to refer to “the radical (double) dissociations that have been found since [Eric] Lenneberg’s pioneering work in the 1950s” (Chomsky 2018, p. 28). Pinker and Jackendoff (2005, p. 207) also defend something like this, pointing to neuroimaging and brain damage studies suggesting that “partly distinct sets of brain areas subserve speech and non-speech sounds,” and to evidence that speech perception “dissociates in a number of ways from the perception of auditory events.”

For this reason I have construed the issue of linguistic specialization along more traditional lines.3 I turn next to the other aspect of the problem of defining a language module.

7.2.2 The domain of language clarified

In one sense, defining the language domain ought to be a simple affair, for is it not just the domain that encompasses activities such as speaking and signing, and (on a broader plane) reading and writing? The straightforward answer to this is yes, but the complete picture is complicated somewhat by the deep and really rather mysterious relationship between thought and language. It is clear that language expresses a speaker’s thoughts, and that whatever other purposes a language may serve, it always comes down to the ability to convert sound (or some other signal) into meanings, and meanings into sound (Chomsky 1980b, p. 46; Sterelny 2006, p. 24; Jackendoff 2007, p. 2; Christiansen & Chater 2016, pp. 114–115). From this perspective, it is natural to view language as serving some sort of coding function, and the language faculty as a cognitive system that enables translation between “mentalese” and strings of symbols (Pinker 1994, p. 60). In such a view, there would seem to be at least two (potentially overlapping but functionally distinct) interacting systems of interest: a thought or “central” system on one hand, and a coding or translation system on the other.4 One system generates and processes thoughts; the other encodes and decodes them. The second system takes its input from the first during production tasks, while the first takes its input from the second during comprehension tasks. This is admittedly crude and schematic; there are also many who would question the aptness of a conduit metaphor for language (Evans & Levinson 2009, pp. 435–436; Smit 2014). Nonetheless, I think the picture is reasonable. As Justin Leiber (2006, pp. 30–31) puts it, the “commonplace distinction that psychologists and linguists use [takes] speaking and hearing to be ‘encoding’ and ‘decoding’—i.e., converting thoughts, or mental items, into the physical speech stream, and converting the physical speech stream into thoughts, or mental items.” Certainly a more useful analogy in the present context would be hard to find, since disputants in the debate over linguistic modularity can be roughly grouped in accordance with how broadly they construe the language domain—as we shall see, there are those who would have it encompass (or even be reduced to) thought, and those who would restrict it to the coding function alone.

Chomsky’s (1965, 1975, 1979, 1980a, 1995, 2002, 2005, 2010, 2016) many iterations of the language module have one thing in common: their portrayal of a central system that encompasses the very mechanisms of thought (Chomsky 2018; McGilvray 2014, p. 59; Collins 2004, p. 518). In a collaborative paper, Hauser, Chomsky, and Fitch (2002) distinguished the faculty of language in a narrow sense (FLN) from the faculty of language in a broad sense (FLB). FLN, a subset of the mechanisms underlying FLB, is “the abstract linguistic computational system alone, independent of the other systems with which it interacts and interfaces” (Hauser et al. 2002, p. 1571). Their assumption is that “a key component of FLN is a computational system (narrow syntax) that generates internal representations and maps them into the sensory-motor interface by the phonological system, and into the conceptual-intentional interface by the (formal) semantic system” (Hauser et al. 2002, p. 1571). Furthermore, “a core property of FLN is recursion,” which yields discrete infinity and is suggested to be the only uniquely human and uniquely linguistic cognitive possession (Hauser et al. 2002, p. 1571). The property of discrete infinity allows the generation of a limitless array of hierarchically structured expressions from a finite base of elements—the same property that (it is alleged) generates the system of natural numbers (Chomsky 2005, 2010). The technical term for this operation is “Merge,” which in its simplest terms is just set formation (Berwick & Chomsky 2016, pp. 10, 98). Merge combines words (“Lexical Items”) and sets of words, taking their semantic information (“features”) to a semantic interface (SEM—the “conceptual-intentional system”) and their sound information to a phonetic interface (PHON—the “sensory-motor system”). Merge is therefore a system that generates sentences (“expressions”) in an inner symbolic code or language of thought (an “I-language”) (Chomsky 2005, pp. 3, 4, 2010, pp. 55, 59; see also Chomsky 2018).
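
Since Merge is glossed here as bare set formation, its generative character can be conveyed in a few lines. What follows is only a toy rendering of that gloss (the example words and the frozenset encoding are mine, not part of the Minimalist formalism):

```python
def merge(x, y):
    """Merge in its simplest terms: set formation over two syntactic
    objects, yielding {x, y}."""
    return frozenset([x, y])

# Repeated application to items from a finite lexicon generates an
# unbounded array of hierarchically structured expressions -- the
# 'discrete infinity' described in the text.
dp = merge("the", "dog")                      # {the, dog}
vp = merge("saw", dp)                         # {saw, {the, dog}}
expression = merge(merge("the", "child"), vp)
# {{the, child}, {saw, {the, dog}}}: hierarchy from set formation alone
```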

It is important to be clear about what conception of language lies behind this proposal. It is easy to be misled by talk of a phonetic interface, the mappings to that interface, and indeed the whole sensory-motor apparatus, which along with the semantic system is supposed to be a system for linking sound and meaning. This tends to imply that the production of an acoustic signal for the purpose of externalization and communication is what language is for. But this is actually only “the traditional assumption” (Chomsky 2010, p. 54). The “primary relation” of interest is supposed to be that between the core faculty of language (FLN) and SEM; i.e., the “systems of thought” (Chomsky 2010, pp. 54–55). Expressions that satisfy the interface conditions of SEM yield a “language of thought,” and it is hypothesized that “the earliest stage of language,” which supposedly arose prior to externalization, was “just that: a language of thought, available for use internally” (Chomsky 2010, p. 55). This inner code was the unique possession of a privileged individual, Prometheus, whose language provided him with “capacities for complex thought, planning, interpretation, and so on . . . [which] would then be transmitted to offspring, coming to predominate” (Chomsky 2010, p. 59). It is easy to forget that because externalization and communication came later, the language of Prometheus was not just a silent inner speech, as the residue of an internalized conventional public symbol system might be. Rather, it is hypothesized to be something like the reflexively complex but wordless stream of thought available to (presumably) any member of Homo sapiens not yet exposed to a public language.5 For language is “virtually synonymous with symbolic thought” (Chomsky 2010, p. 59, quoting Ian Tattersall), and “fundamentally a system of thought” (Berwick & Chomsky 2016, p. 102). Perhaps the clearest indication that, for Chomsky, language is the acme of central cognition comes from recent remarks suggesting that language functions as a means of integrating information from various proprietary domains: “language is the lingua franca that binds together the different representations from geometric and nongeometric ‘modules,’ just as an ‘inner mental tool’ should. Being able to integrate a variety of perceptual cues and reason about them . . . would seem to have definite selective advantages” (Berwick & Chomsky 2016, pp. 165–166). This makes Prometheus’ language a “language of thought” in pretty much the classical sense (Fodor 1975). Thus, when Chomsky implores us to consider how difficult it is not to talk to ourselves, both during sleep and in almost every waking hour (Berwick & Chomsky 2016, p. 64), to press the point that language is really an instrument of thought, it is important not to assume (no matter how reasonably) that he is extolling the virtues of a public language. The powerful scaffolding a public language provides in the form of an echo for our ideas and ruminations—the chance to objectify and insinuate our thoughts into a manipulatable format external to ourselves, surely what makes language able to serve as a “tool for thought” par excellence—cannot be denied, of course, and Chomsky certainly does not deny it (e.g., Berwick & Chomsky 2016, p. 102). But his primary aim here is not to make the case for externalization so much as to point up the intimate and virtually indissoluble relation between a Promethean private language and internal thought.
For language here ultimately means something other than what most people, and I suspect what most language researchers, think about when they think about language (see later, this chapter). Most researchers would understand the coding function to be a distinct system for the translation of thought into the sentences of a public language, even if this system can be decomposed into elements that are shared with other systems (including systems of thought). Now just what all this implies for an ELU we shall come to presently, but first let me contrast Chomsky’s view with Jerry Fodor’s, who seems to have a more conventional—Chomsky would say “traditional”—understanding of what I have called the coding function.

Fodor has consistently maintained that only peripheral input systems are likely to be modular. In this view, modules are associated with specific channels of sensory transduction—there may be modules for vision, olfaction, and even aspects of syntactic processing, but probably not for complex thought, memory, and judgment. I have two points to make about this, the first somewhat ancillary to the second. In light of what I have discussed in previous chapters, this way of construing the difference between central and peripheral systems seems clearly mistaken. The material I presented in Chapters 2 through 5 demonstrates that elements of even our most evolutionarily ancient transduction systems participate in various cross-domain functional composites (C-networks), including those underlying central processes. Transduction dynamics, which are usually characterized by a certain degree of speed, autonomy, or reflexivity, may even be activated in many cases by the same domain-general nodes (M-networks/modules) that yield central system dynamics. This might in fact explain the frequent penetrability of perception. Thus it is no longer really plausible to cash out the difference between central and peripheral system dynamics in terms of modules, for just the same sorts of entities seem to underlie both types. Of course, Fodor does not explicitly exclude the possibility that cognition might be underwritten by anatomically or functionally exiguous units throughout—the basic assumption in cognitive neuroscience. It is just that he has construed the term “module” to mean something quite specific: a device for the processing of transduced information. This construal simply does not contemplate the autonomous, domain-general columns that handle low-level subfunctions right across the neocortex, long understood to be the seat of complex thought and executive function. But Fodor does not own the term, and the modular hypothesis—under that very name and always referring to the functionally specialized units of the mind/brain—goes back at least to the 1950s, appearing in works by Vernon Mountcastle (1957, 1978), David Marr (1976), and Noam Chomsky (1980a) well before the appearance of Fodor’s (1983) monograph.6 As Collins (2004, p. 506) summarizes the Fodorian attitude to the central systems: “for Fodor, whether there are ‘central’ modules is at best moot; the thesis that it’s all modules he considers to be virtually a priori false.”

The point about Fodorian modularity I want to emphasize, however, is not that it draws a distinction that is somewhat arbitrary at the level of modularity (it may be more aptly drawn at some other level of inquiry, e.g., an evolutionary one); it is that his understanding of modularity leads directly to a certain kind of language module, one very different from Chomsky’s (Collins 2004). Since modules for him are peripheral input devices, it follows that any language module must be peripheral, and thus not the sort of system that generates expressions in an inner symbolic code, as Chomsky’s does. Fodor’s language module is a “sentence encoding-decoding system”—a parser, with an encapsulated representation of grammar (Fodor et al. 1974, p. 370).7 Language is for him “a psychological mechanism that can be plausibly thought of as functioning to provide information about the distal environment in a format appropriate for central processing” (Fodor 1983, p. 44). In this account, language is not a central process, not pure symbolic thought, as it is for Chomsky; rather it is a “psychological mechanism” that provides grist for the central system mill (i.e., for the inner “language of thought”). The cleavage between Chomsky and Fodor, then, relates to the difference between “knowledge” and “use” of language, or “competence” and “performance” (Chomsky 2018). Chomsky’s module is an internal generative device that accounts for our knowledge of language (competence). Fodor’s is a peripheral input device that accounts for our use of language (performance).

All this can make for confusion in debates about the modularity of language. It is not hard to see how interlocutors might talk past one another. Does a mechanism recruited exclusively for thought, or perhaps for thought and a more peripheral coding operation—but nowhere else across cognition—count as an ELU? Or must the mechanism be exclusive to the coding operation alone before it can be considered an ELU? It depends on whether you view systems of thought as forming part of the domain of language. Evidently some do and others do not. Take metarepresentation as a case in point: the capacity for nested thinking that allows us to embed thoughts within thoughts, in principle indefinitely, as witnessed in a child’s being able to draw a picture of themselves drawing a picture (Suddendorf 2013; Zerilli 2014). If it could be shown that metarepresentation is an exclusive property of thought, or an exclusive property of thought and the coding function taken together, metarepresentation would count as an ELU on a Chomskian interpretation of language (defined in terms of thought). For someone with a more traditional understanding, in contrast, metarepresentation would not count as an ELU, for although it might appear in the coding function, it is exploited outside the language domain (defined in terms of processes that operate distinctly from thought), in this case within the systems of thought.

Morten Christiansen and Nick Chater are two psycholinguists who appear to have the more traditional understanding of the language domain in mind. Among the various factors they cite to explain why natural languages appear to be so well suited to the human brain, and hence easy to learn and process, they include “constraints from thought” (Christiansen & Chater 2016, p. 51). This form of explanation makes most sense from the point of view that language and thought are not synonymous (otherwise the explanation would be uninformative). It is just as well, then, that Christiansen and Chater do indeed regard “constraints from thought” as “non-linguistic constraints” (2016, p. 50). To give a vivid sense of the confusion that these differences of view have engendered, compare Christiansen and Chater’s (2016) book with Berwick and Chomsky’s (2016). Berwick and Chomsky maintain that “Universal Grammar” is language-specific, whereas Christiansen and Chater deny that the core operations underlying language processing are language-specific, seeing them as merely applications of general-purpose, non-hierarchical sequence-learning abilities. However, as Richard Moore (2017, p. 611) pointed out in his illuminating review, “the nature of [the] disagreement here must be stated carefully because of the different ways in which BC and CC use the word ‘language.’ ” He goes on:

BC think that language, in the form of the conjunction of Merge and the conceptual interface, is an element of thought that underwent natural selection for improvements in planning. While for them language is independent of communication, for CC language is just the set of natural languages—and their use of the word reflects this. Given their view, BC would not expect that areas of the brain involved in natural language use would be used only for natural language; they will be used for general purpose thinking too. This makes their thesis less easy to distinguish from CC’s. . . . If language is understood in terms of natural language, then neither BC nor CC hold that syntax is language-specific. (Moore 2017, pp. 611–612)

If the interlocutors themselves paid more attention to these distinctions, the precise nature of their disagreements could be better understood.

While I shall adopt the more traditional construal of the language domain (à la Christiansen and Chater), in § 7.3 I will survey evidence of the extensive reuse of language circuits across domains having nothing much to do with either language or thought. In other words, the material I present herein should be problematic for anyone defending the existence of an ELU, regardless of how eccentrically they wish to construe the language domain.

7.2.3 Examples of elementary linguistic units

Before leaving this section, I should provide some further guidance on the most likely candidates for the role of an ELU. Now that we have clarified both in what respects an ELU would be specialized and in what sense it could be linguistic, we can turn to some concrete proposals.

Much of the impetus for the claim that the mind/brain contains ELUs came from early work in generative linguistics, which formalized a large stock of highly intricate and apparently system-specific rules for the derivation of grammatical strings (“surface structures”) from the more abstract “kernel” sentences (“deep structures”) underlying them (Chomsky 1956, 1957). These unspoken deep structures were hypothesized to be “present to the mind” whenever a speaker produces the surface forms of her language (Chomsky 2006, p. 16). This inspired the belief that the mind/brain incorporates specialized systems that function more or less exclusively for the generation of surface structures. While the field of generative linguistics today would hardly be recognizable to an undergraduate familiar with work from (say) the mid-to-late 1960s, the influence of that early work has not dissipated entirely—and it is, for all that times have changed, still plausible to suppose that at least some linguistic operations are domain-specific. Let me illustrate with a simple example drawn from the generative tradition.

The assignment of phonetic interpretations to surface structures might hint at cognitive resources that, by virtue of how detailed and context-specific they seem, could reasonably be supposed to serve no other function. Assume that a speaker has encountered the following phonetic realizations:

expedite → expeditious

contrite → contrition

ignite → ignition

Assume further that the speaker has not yet encountered the word “righteous,” so has not yet been in a position to establish the derivation

right → righteous

The speaker, on hearing “righteous” (properly pronounced “rahy-chuh-s”) for the first time, knows that the underlying form cannot be the same as for expeditious, contrition, and so on (unless the case is just an exception), though had the speaker heard “rish-uh-s” they would not have hesitated to conclude that “rite” was the underlying form (analogously to expedite/expeditious, etc.). The speaker understands that the underlying form of “righteous” must instead be “right” (or, more technically, a form containing “i” followed by the velar continuant “gh”), for only some such form could make sense of what was heard given the following rule (which the speaker must be taken to know):

“t” followed by a high front vowel [“-eou,” “-iou,” “-ion,” “-ian,” etc.] is realized as “ch” [as in chew, choke, challenge, etc.] after a continuant [e.g., “ahy,” as in fight, bight, sight, etc., as opposed to “i” as in fit, bit, sit, etc.], and as “sh” [as in shoe, show, sham, etc.] elsewhere.
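
The conditional structure of the rule may be easier to see in procedural form. The following is a deliberately crude sketch of the realization step alone, with the segment labels standing in for the bracketed glosses above (my own shorthand, not a serious phonological formalism):

```python
def realize_t(preceding: str, following: str) -> str:
    """Schematic rendering of the rule above: underlying /t/ before a
    high front vowel surfaces as 'ch' after a continuant and as 'sh'
    elsewhere; before anything else, /t/ is unchanged."""
    high_front_vowels = {"eou", "iou", "ion", "ian"}
    continuants = {"gh"}  # the velar continuant of underlying 'right'

    if following not in high_front_vowels:
        return "t"                        # rule inapplicable: bare 'right'
    return "ch" if preceding in continuants else "sh"

# righteous: underlying 'right' + '-eous'; /t/ follows 'gh' -> 'ch'
assert realize_t("gh", "eou") == "ch"     # 'rahy-chuh-s'
# expeditious: underlying 'expedite' + '-ious'; no continuant -> 'sh'
assert realize_t("i", "iou") == "sh"      # 'ek-spi-dish-us'
```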

Detailed phonological rules of this kind—in fact much more intricate ones than this—have frequently been thought to reflect principles not obviously assimilable to other cognitive domains, pertaining exclusively to the coding function. This accompanies the thought that such rules are so exotic, relative to the agent’s overall envelope of capacities, that handling them must require a very special suite of neural and computational resources.

Pinker and Jackendoff (2005) suggest other rules. They observe that many grammatical principles have no real application outside language, principles such as linear word order (John loves Mary/Mary loves John), agreement (the boy runs vs the boy run), case (John saw him vs John saw he), tense, aspect, derivational morphology (run/runner), and inflectional morphology (the girls are/the girl is). Moreover, they contend that linguistic recursivity is not reducible to analogues in mathematics. They also nominate speech perception as possibly uniquely adapted for the perception of human speech sounds (and not other types of sounds). Brattico and Liikkanen (2009, p. 261), in passing, suggest that the lexicon, as “a list of feature bundles,” is domain-specific. Their argument is in fact that the only truly domain-specific aspects of language will turn out to be nongenerative—generative mechanisms (recursion/Merge) will be domain-general. Once again, though, it is not always clear what is meant by “domain-specific.” While Pinker and Jackendoff seem to have the more robust sense of dissociability in view—that is, at least in part a structural notion—one could interpret “domain-specific lexicon” to imply merely that lexical representations are domain-specific, a considerably weaker notion. To be clear (and at the risk of repetition), it is only the former (structural) notion that interests me here.

Actually, the question of what it takes to be domain-specific, or “specialized for X,” can be a little more complicated. For instance, associative learning is the paradigm domain-general cognitive capacity (in both the structural and psychological senses). But a particular learned association, say between fire and warmth, could be considered domain-specific in yet another (somewhat misleading) sense: the specific associative mechanism linking fire and warmth may be discretely localized in the brain and active only in response to those specific stimuli. Similarly, we might have a general capacity to run recursive algorithms, but a parallel implementation of that procedure, say a numerical one, would be “domain-specific.” It is therefore important to distinguish between a general capacity and a specific, repeated, and (potentially) parallel use of that capacity. I will return to this important distinction in § 7.5 when I discuss neural redundancy.
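
That distinction is easy to exhibit in code. In this sketch (the function names and the summation example are mine, chosen purely for illustration), the first function is the general recursive capacity; the second is a dedicated parallel copy of the very same procedure, wired to a single domain:

```python
from typing import Callable, List, TypeVar

T = TypeVar("T")

def fold(combine: Callable[[T, T], T], items: List[T]) -> T:
    """A domain-general recursive capacity: nothing here is tied to
    numbers, words, or any other class of stimuli."""
    if len(items) == 1:
        return items[0]
    return combine(items[0], fold(combine, items[1:]))

def sum_rec(numbers: List[int]) -> int:
    """A 'domain-specific' parallel implementation: the very same
    recursive procedure, duplicated and dedicated to numbers alone."""
    if len(numbers) == 1:
        return numbers[0]
    return numbers[0] + sum_rec(numbers[1:])

assert fold(lambda a, b: a + b, [1, 2, 3]) == sum_rec([1, 2, 3]) == 6
```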