The Brain Module - Modules Reconsidered: Varieties of Modularity

The Adaptable Mind: What Neuroplasticity and Neural Reuse Tell Us about Language and Cognition - John Zerilli 2021

As I have already mentioned several times in passing, neuroscience gets by for the most part with a very specific notion of modularity at hand. This is not the sense in which modules are familiar in network science, nor the sense in which they are familiar in most of psychology and cognitive science. The neuroscientific module is sometimes called a “brain module” or “cortical module” (Mountcastle 1978, 1997; Pascual-Leone & Hamilton 2001; Gold & Roskies 2008; Rowland & Moser 2014; Zador 2015), at other times a “cortical column” or “columnar module” (Mountcastle 1978, 1997; Buxhoeveden & Casanova 2002; Amaral & Strick 2013; Zador 2015), at still other times an “elementary processing unit” (Kandel & Hudspeth 2013), or simply an “operator” (Pascual-Leone & Hamilton 2001; Pascual-Leone et al. 2005). As I foreshadowed earlier, it corresponds (very roughly) with the node of a neural coactivation graph. Slight variations in the meanings of these terms will not be important in the present context. It is true that the cortical “column” forms part of a distinctive hypothesis in neuroscience that arguably contemplates a narrower class of phenomena than is conveyed by the nodes of a network graph. But nothing need turn on this here. Indeed, from one point of view, the metamodal (reusable) node is a fully generalized account of the more specific columnar module (Jacobs 1999, pp. 33–34; Pascual-Leone & Hamilton 2001, pp. 427–428, 441, 443).

Various formulations of the criteria for modularity have been proposed in neuroscience (Buxhoeveden & Casanova 2002, p. 940). The general notion is of a coherent functional unit with a more or less dedicated input-output specification, somewhat on a par with the modern microprocessor chip (Leise 1990, p. 1). Gazzaniga (1989, p. 947) assumes “a high degree of functional specificity in the information transmitted over neural systems,” and that modular organization consists of “identifiable component processes that participate in the generation of a cognitive state. The effects of isolating entire modular systems or of disconnecting the component parts can be observed” (my emphasis). Leise (1990) defines “module” as a group of cells with similar response properties (see also Amaral & Strick 2013, p. 348; Zador 2015, p. 44). Krubitzer (1995, p. 412) defines modules as “structural and physiological discontinuities within the limits of a classically defined cortical field . . . reflected in architectonic appearance . . . neural-response properties, stimulus preference and connections.” The idea here is clearly predicated upon both functional and anatomical specificity.

The brain module’s explanatory rationale is simple. As Gazzaniga (1989, p. 947) concludes from a review of the comparative evidence, “research on animals has led to the belief that there are anatomic modules involved in information processing of all kinds and that they work in parallel and are distributed throughout the brain.” In the same vein, Kandel and Hudspeth (2013, p. 17) state that neuroscientists “now think that all [higher level] cognitive abilities result from the interaction of many processing mechanisms distributed in several regions of the brain. Specific brain regions are not responsible for specific mental faculties” (my emphasis). Higher level/gross functions such as language, perception, affect, thought, movement, and memory “are all made possible by the interlinkage of serial and parallel processing in discrete brain regions, each with specific functions” (Kandel & Hudspeth 2013, p. 17, my emphasis; Bressler 1995; Gazzaniga 1989, p. 947). High-level mental functions fractionate into low-level subfunctions, then, and it is these narrowly defined low-level operations that are understood to satisfy the criteria for modularity in neuroscience. The key principle here is that of distributed parallel processing, in which “functional parts . . . interconnect uniquely to form processing networks” (Krubitzer 1995, p. 408; Bressler 1995; Mountcastle 1997, p. 717). Kandel and Hudspeth give a vivid illustration:

Simple introspection suggests that we store each piece of our knowledge as a single representation that can be recalled by memory-jogging stimuli or even by the imagination alone. Everything you know about your grandmother, for example, seems to be stored in one complete representation that is equally accessible whether you see her in person, hear her voice, or simply think about her. Our experience, however, is not a faithful guide to how knowledge is stored in memory. Knowledge about grandmother is not stored as a single representation but rather is subdivided into distinct categories and stored separately. One region of the brain stores information about the invariant physical features that trigger your visual recognition of her. Information about changeable aspects of her face—her expression and lip movements that relate to social communication—is stored in another region. The ability to recognize her voice is mediated in yet another region. (2013, pp. 17—18)

This picture fits flush with the sort of distributed parallel activation evidence that underpins neural reuse (Pasqualotto 2016; Pessoa 2016). Indeed, to the extent that they are not strictly domain-specific, the stable low-level operations that occur as nodes in these distributed systems seem to be the empirical equivalent of the low-level cognitive workings posited in the earliest formulations of the massive redeployment hypothesis.

A little history will clarify the significance of this discovery. The elaboration of the distributed processing model is the high point of an intense research effort within the structuralist tradition. In my earlier discussion, I noted that Carl Wernicke stood out among the ranks of modern neurologists with his distinctive vision of the structure-function relationship. I suggested that he may even have been operating with an implicit understanding of the difference between a cognitive working and a cognitive use (Bergeron 2007). In a famous paper, Wernicke (1908) described a novel kind of aphasia, one in which the patient can produce words but not comprehend them: the precise inverse of the pathology Broca had described in 1861. The lesion responsible for this aphasia lay in a distinct cortical region of the left cerebral hemisphere (later called “Wernicke’s area”). Wernicke presented his account of this pathology in terms of an explicit neural model of language processing that attempted to steer a middle course between the two competing frameworks of his day, that of the phrenologists and cellular connectionists on one hand, who contended that specific functions were realized in localized neural tissue (and were therefore guided by the anatomical modularity assumption), and that of the holists on the other, who supposed that every mental function involved the brain as an aggregate (Kandel & Hudspeth 2013). Wernicke’s model had only basic sensory-motor and perceptual functions localized to discrete regions of cortex. Higher functions depended on the cooperation of several neural elements, implying that single behaviors could not be pinned down to specific sites. Wernicke thus became the first neurologist to advance the thoroughly modern notion of distributed processing (Kandel & Hudspeth 2013; Mountcastle 1997).
He assigned a specific language motor program governing the mouth movements for speech to the region implicated in Broca’s aphasia, and a sensory program governing word perception to the area implicated in the new aphasia he described.

According to this model, the initial steps in neural processing of spoken or written words occur in separate sensory areas of the cortex specialized for auditory or visual information. This information is then conveyed to a cortical association area, the angular gyrus, specialized for processing both auditory and visual information. Here, according to Wernicke, spoken or written words are transformed into a neural sensory code shared by both speech and writing. This representation is conveyed to Wernicke’s area, where it is recognized as language and associated with meaning. It is also relayed to Broca’s area, which contains the rules, or grammar, for transforming the sensory representation into a motor representation that can be realized as spoken or written language. When this transformation from sensory to motor representation cannot take place, the patient loses the ability to speak and write. (Kandel & Hudspeth 2013, p. 12)

The success of Wernicke’s clinical model in predicting a third type of aphasia (one in which “the receptive and expressive zones for speech are intact, but the neuronal fibers that connect them are destroyed”), as well as its general influence among late nineteenth-century neurologists, helped inaugurate a new approach to cortical localization spearheaded by the German anatomist Korbinian Brodmann. Brodmann’s revolutionary method of distinguishing cortical regions on the basis of cellular shape and vertical orientation brings us one step closer to the cortical columns that are now taken to be the “fundamental computational modules of the neocortex” (Amaral & Strick 2013, p. 348).

Brodmann’s contribution was to extend the histological and cytoarchitectonic methods of his day by working comparatively; i.e., across species. He showed that neurons in the cerebral cortex have both a layer-wise (laminar) and vertical (columnar) orientation, and used this structure to guide his subdivision of the brain into more functionally discrete regions. Specifically, Brodmann noted differences in the packing densities and shapes of neurons as he bored down into the cortex, as well as differences in laminar thickness and synaptic connections as he traveled horizontally along its surface. This proved to be a decisive step, for we now know that functional differences in cortex depend on the relative thickness of layers as one moves from region to region. Each of its six layers is characterized by different inputs and outputs, with neurons projecting to different parts of the brain. “Projections to other parts of the neocortex, the so-called cortico-cortical or associational connections, arise primarily from neurons in layers II and III. Projections to subcortical regions arise mainly from layers V and VI” (Amaral & Strick 2013, p. 346). “Input” areas such as the primary visual cortex receive sensory information from the thalamus, and therefore have an enlarged layer IV, since this is where axons from the thalamus typically terminate: “The input layer contains a specialized type of excitatory neuron called the stellate cell, which has a dense bushy dendrite that is relatively localized, and seems particularly good at collecting the local axonal input to this layer” (O’Reilly et al. 2012, p. 33). “Hidden” areas, processing neither inputs nor outputs but essential to the formation of abstract category representations, are thickest at layers II and III, since the predominance of pyramidal cells in these layers makes them “well positioned for performing this critical categorization function” (O’Reilly et al. 2012, p. 34). 
Finally, “output” areas have their thickest layers at layers V and VI, given that the efferent connections that typify output zones must “synapse directly onto [subcortical] muscle control areas,” and it is the neurons in these layers that best meet this requirement (O’Reilly et al. 2012, p. 34). Brodmann marked the boundaries where these surface differences occurred and was thus able to distinguish the 47 distinct brain regions that have since become eponymous. Each of Brodmann’s brain areas consequently relates to a specific cognitive or sensory-motor function: areas 44 and 45, for instance, correspond to Broca’s area, and area 22 corresponds to Wernicke’s area.
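The laminar rule of thumb just described (input areas thickest at layer IV, hidden areas at layers II and III, output areas at layers V and VI) can be put as a toy classifier. This is only a sketch in Python: the function name `classify_area`, the three-way grouping of layers, and the thickness figures are assumptions made for illustration, not measurements and not a method drawn from O’Reilly et al. (2012).

```python
from typing import Mapping

# Hypothetical mapping from the thickest layer group to the functional
# role that the rule of thumb associates with it.
ROLE_BY_DOMINANT_LAYER = {
    "IV": "input",
    "II/III": "hidden",
    "V/VI": "output",
}


def classify_area(thickness: Mapping[str, float]) -> str:
    """Return the role suggested by whichever layer group is thickest.

    `thickness` maps each key of ROLE_BY_DOMINANT_LAYER to a relative
    thickness (any consistent unit). Ties go to the first key listed above.
    """
    dominant = max(ROLE_BY_DOMINANT_LAYER, key=lambda layer: thickness[layer])
    return ROLE_BY_DOMINANT_LAYER[dominant]


# Illustrative, made-up profile with an enlarged layer IV.
visual_like = {"II/III": 0.30, "IV": 0.45, "V/VI": 0.25}
print(classify_area(visual_like))  # input
```

Real cortical areas differ by degree, so a hard three-way label of this kind is at best a mnemonic for the pattern, not a substitute for the cytoarchitectonic analysis itself.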

This is where modules reenter the story. The sort of cytoarchitectonic methods that Brodmann employed, while delivering a very useful functional subdivision by the standards of his day, were not quite able to do justice to the subtlety of functional variation in the cortex. For the five regions Brodmann designated as being concerned with visual function (areas 17–21), modern electrophysiological and connectional analyses have distinguished 35. These take the form of cortical columns that run from the outermost surface of the cortical sheet (the so-called pial surface) to the white matter deep beneath layer VI. A “column” is in effect a very thin cross-sectional slice of the cortical field, no more than a fraction of a millimeter across, such that “[n]eurons within a column tend to have very similar response properties, presumably because they form a local processing network” (Amaral & Strick 2013, p. 348). It is this distinctive columnar structure that passes for the basic cognitive module of neuroscience today (Mountcastle 1997; Zador 2015), and its importance resides, partly at least, in the computational efficiency it confers on neural circuits:

Columnar organization . . . minimizes the distance required for neurons with similar functional properties to communicate with one another and allows them to share inputs from discrete pathways that convey information about particular sensory attributes. This efficient connectivity economizes on the use of brain volume and maximizes processing speed. The clustering of neurons into functional groups, as in the columns of the cortex, allows the brain to minimize the number of neurons required for analyzing different attributes. If all neurons were tuned for every attribute, the resultant combinatorial explosion would require a prohibitive number of neurons. (Gilbert 2013, p. 570)

At least part of the motivation for the brain module, then, is to address concerns pertaining to the scaling problem we encountered in § 3.2 (i.e., as the number of neurons increases, the number of connections needed to link every neuron to every other grows quadratically). It is genuinely modular in the sense of possessing both functional specificity (i.e., a discrete computational operation definable over a preferred, but nonexclusive, set of inputs) and spatial localization (Pascual-Leone & Hamilton 2001, pp. 441, 443; Gazzaniga 1989, p. 947; O’Reilly et al. 2012, pp. 36–40; Pasqualotto 2016; Pessoa 2016).
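The scaling concern can be made concrete with a little arithmetic. The Python sketch below compares the number of links needed for all-to-all wiring with the number needed when units are grouped into modules. The function names, the equal module sizes, and the single link between each pair of modules are simplifying assumptions introduced here for illustration; they are not a model taken from § 3.2 or from the neuroscientific literature.

```python
def pairwise_links(n: int) -> int:
    """Distinct undirected links among n fully interconnected units."""
    return n * (n - 1) // 2


def modular_links(n: int, module_size: int) -> int:
    """Links when wiring is all-to-all only inside equal-sized modules.

    Assumption (hypothetical): each pair of modules is joined by exactly
    one link, standing in for a sparse long-range pathway.
    """
    if module_size <= 0 or n % module_size != 0:
        raise ValueError("n must be a positive multiple of module_size")
    modules = n // module_size
    within = modules * pairwise_links(module_size)  # dense local wiring
    between = pairwise_links(modules)               # sparse module-to-module wiring
    return within + between


print(pairwise_links(1000))      # 499500
print(modular_links(1000, 100))  # 49545
```

On these toy assumptions, clustering a thousand units into ten modules cuts the link count by roughly a factor of ten, which is the kind of economy on wiring that the passage from Gilbert (2013) attributes to columnar organization.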

All this is predominantly (and paradigmatically) true of the sensory-motor cortical maps discussed in Chapter 2. Many of these have “functionally specific, connected neurons to extract behaviorally relevant features [e.g., lines and edges from spatial receptive fields] from incoming sensory information” and “a degree of functional autonomy” (Rowland & Moser 2014, p. 22). Whether this organization is exemplified also by non-sensory/non-motor high-end association cortices has not up until now been clear, but Rowland and Moser (2014) review evidence suggesting that there are definite similarities between sensorimotor columns and the organization found in medial entorhinal cortex (MEC) implicated in episodic and spatial memory tasks. If the grid map of MEC really were to be organized in this modular fashion, it would certainly put paid to the idea of a rigid Cartesian distinction between “central” and “peripheral” cognition as far as modularity is concerned (see § 7.2.2). Of course, the precise degree to which MEC resembles columnar organization is the crucial question. The similarities for their part are clear: MEC has “vertically linked cells, tight bundling of dendrites from the deeper layers, and predominantly local connections raising the possibility that it contains functionally autonomous columns” (Rowland & Moser 2014, p. 22). Moreover, “MEC has well-defined spatial responses that allow the cells to be analyzed for topography and modularity in their response properties” (Rowland & Moser 2014, p. 22). There is one noteworthy difference, however. The majority of entorhinal modules appear to be anatomically intermingled such that, while they remain functionally independent and discrete (dissociable in principle), they are anatomically overlapping and spatially interspersed, rather than strictly localized. Entorhinal modules therefore appear to be merely functional, not anatomical. 
Their functional specificity is further corroborated by the fact that, although columns are themselves composed of far smaller units called “minicolumns” (consisting of between 80 and 100 neurons), “[n]o research has yet determined the capacity of minicolumns for independent activity outside the macrocolumn that they belong to” (Buxhoeveden & Casanova 2002, p. 937). The upshot of all this is that the brain could be organized into column-based modules of roughly common form throughout, including regions that are important to central cognition.

What needs most emphasizing about the brain module are the very qualities that set it apart from the classical notion that still features unmistakably in discussions of modularity within cognitive science, cognitive neuropsychology, neuropsychology, and the philosophy of mind. Here I am referring to its extremely restricted scope (an exiguously small subfunctional computation) and its dynamic metamodal response properties: the brain module is in essence a domain-general reusable operator appearing within various interacting, nested, and distributed neural assemblies (Mountcastle 1997; Jacobs 1999; Pascual-Leone & Hamilton 2001; Pascual-Leone et al. 2005; Pasqualotto 2016; Pessoa 2016). We saw these dynamic response properties in connection with an earlier discussion revolving around crossmodal plasticity, supramodal organization, and domain specificity (§§ 2.4.2–2.4.3). I shall revisit and elaborate on this material in the next chapter, when I explain more fully the character and import of Pascual-Leone and Hamilton’s (2001) original metamodal hypothesis of brain organization. It will be relevant both to the issue of the functional specificity of modules (§ 5.1) and to their early development (Chapter 6).

Thus far I have provided an outline of the varieties of modularity, defended what I take to be indispensable in any modular theory of the mind, and foregrounded the neuroscientific notion of modularity. The next chapter pursues head-on the implications of neural reuse for the modularity of mind.