The neural and computational bases of semantic cognition

This Review examines a decade of research suggesting that semantic cognition relies on two principal interacting neural systems. The first system is one of representation, which encodes knowledge of concepts through the learning of the higher-order relationships among various sensory, motor, linguistic and affective sources of information that are widely distributed in the cortex. Conceptual representations are distilled within this system from lifelong verbal and non-verbal experience1,2,3,4, and serve to promote knowledge generalization across items and contexts5,6,7. The second system is one of control, which manipulates activation within the representational system to generate inferences and behaviours that are appropriate for each specific temporal or task context8,9,10,11,12. We refer to this two-system view as the controlled semantic cognition (CSC) framework. In what follows, we review the converging evidence for each part of the CSC framework and consider how it reconciles long-standing puzzles from studies of both healthy and disordered semantic cognition.

Around a decade ago, we and others proposed the 'hub-and-spoke' theory of semantic representation (Fig. 1), which explained how conceptual knowledge might arise through learning about the statistical structure of our multimodal experiences, and also proposed some neuroanatomical underpinnings for these abilities, accounting for patterns of impairment that are observed in some semantic disorders. The hub-and-spoke theory assimilated two important, existing ideas. First, in keeping with Meynert and Wernicke's classical proposal and contemporary 'embodied' theories (Box 1), the hub-and-spoke model assumed that multimodal verbal and non-verbal experiences provide the core 'ingredients' for constructing concepts and that these information sources are encoded in modality-specific cortices, which are distributed across the brain (the 'spokes'). Second, the model proposed that cross-modal interactions for all modality-specific sources of information are mediated, at least in part, by a single transmodal hub that is situated bilaterally in the anterior temporal lobes (ATLs). This second idea runs counter to some classical hypotheses and to contemporary 'distributed-only' theories of semantic representation, which have assumed that concepts arise through direct connections among modality-specific regions without a common transmodal region.

The ATL-hub view was motivated by both empirical and computational observations. The empirical motivation stemmed from cognitive neuropsychology. It was already known that damage to higher-order association cortices could produce striking transmodal semantic impairments, leading some researchers to propose the existence of multiple cross-modal 'convergence zones', possibly specialized to represent different conceptual domains. However, a detailed study of the striking disorder called semantic dementia (SD) (Supplementary information S1 (figure)) suggested that the ATL transmodal region might be important for all conceptual domains, as individuals with SD show semantic impairments across all modalities and virtually all types of concept (with the exception of simple numerical knowledge). Several additional characteristics of the impairment in SD seem to be compatible only with disruption of a central, transmodal hub in this disorder. Notably, individuals with SD show markedly consistent patterns of deficits across tasks, despite wide variation in the modality of stimulus, response or type of knowledge required. Indeed, the likelihood that patients with SD correctly respond to a given item in a task requiring semantic knowledge can be consistently predicted by a combination of three factors: the familiarity of the item (high familiarity leads to better performance; Supplementary information S1 (figure)), the typicality of the item within a domain (typical items are associated with better performance; Supplementary information S2 (figure)) and the specificity of the knowledge that is required by the task (high specificity leads to worse performance). Unlike some forms of dementia (such as Alzheimer disease) that are associated with widespread pathology in the brain, SD is associated with atrophy and hypometabolism that are centred on the anterior ventral and polar temporal regions bilaterally (Supplementary information S1 (figure)), suggesting that these regions serve as the transmodal domain-general conceptual hub.
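The joint influence of these three factors can be illustrated with a simple logistic model. The sketch below is purely illustrative rather than a reconstruction of the patient analyses: the function name, the use of standardized predictors and the coefficient values are all invented, and only the signs of the effects follow the findings described above.

```python
# Illustrative sketch of the three-factor account of SD performance.
# Coefficients are hypothetical; only their signs reflect the reported
# effects (familiarity and typicality help, required specificity hurts).
import numpy as np

def p_correct(familiarity, typicality, specificity,
              b0=0.0, b_fam=1.2, b_typ=0.8, b_spec=1.5):
    """Probability of a correct response to one item, as a logistic
    function of z-scored familiarity, typicality and required specificity."""
    logit = b0 + b_fam * familiarity + b_typ * typicality - b_spec * specificity
    return 1.0 / (1.0 + np.exp(-logit))

# A familiar, typical item probed at a general level is likely to be correct...
print(p_correct(familiarity=1.0, typicality=1.0, specificity=-1.0))  # ~0.97
# ...whereas an unfamiliar, atypical item probed at a specific level is not.
print(p_correct(familiarity=-1.0, typicality=-1.0, specificity=1.0))  # ~0.03
```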

Computationally, the hub-and-spoke hypothesis provided a solution to the challenges of building coherent, generalizable concepts that have been highlighted in philosophy and cognitive science (for a more detailed discussion, see Refs 5,10,33). One challenge is that the information relevant to a given concept is experienced across different verbal and sensory modalities, contexts and time points. Another challenge is that conceptual structure is not transparently reflected in the sensory, motor or linguistic structure of the environment -- instead, the relationship between conceptual structure and modality-specific features is complex, variable and nonlinear. It is difficult to see how these challenges could be met by a system that simply encodes direct associations among the modality-specific information sources, but they can be solved by neural network models that adopt an intermediating hub for all concepts and modalities.
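The structural commitment at stake here can be made concrete in a toy network. The sketch below is an architectural illustration only, with invented spoke names, arbitrary layer sizes and untrained random weights; it shows the key design choice of the hub-and-spoke account, namely that every cross-modal mapping is routed through one shared transmodal layer rather than through direct spoke-to-spoke connections.

```python
# Minimal hub-and-spoke architecture: modality-specific 'spokes' communicate
# only via a single shared 'hub' layer (weights here are random, not trained).
import numpy as np

rng = np.random.default_rng(0)
SPOKES = ["visual", "auditory", "verbal"]   # hypothetical modalities
SPOKE_DIM, HUB_DIM = 20, 12                 # arbitrary sizes

W_in  = {s: rng.normal(0, 0.5, (HUB_DIM, SPOKE_DIM)) for s in SPOKES}
W_out = {s: rng.normal(0, 0.5, (SPOKE_DIM, HUB_DIM)) for s in SPOKES}

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def crossmodal_pass(inputs):
    """Drive the hub from any subset of spokes, then reinstate all spokes.
    There are deliberately no direct spoke-to-spoke weights."""
    hub = sigmoid(sum(W_in[s] @ x for s, x in inputs.items()))
    return {s: sigmoid(W_out[s] @ hub) for s in SPOKES}

# For example, a picture (visual input alone) activates the hub, which in
# turn reinstates the auditory and verbal representations of that concept.
outputs = crossmodal_pass({"visual": rng.random(SPOKE_DIM)})
print({s: v.shape for s, v in outputs.items()})
```

When networks of this shape are trained on cross-modal mapping tasks, the hub layer comes to encode similarity structure that reflects conceptual rather than modality-specific organization, which is what allows such models to meet the two challenges described above.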

Various brain regions have long been a target of research in semantics (Box 2), but the ATL received little prior attention. Indeed, although individuals with SD were reported more than a century ago, the link between semantic impairment and ATL damage only became apparent with modern neuroimaging techniques. Classical language models were based on patients with middle cerebral artery stroke, which rarely damages the middle-to-ventral ATL and almost never does so bilaterally. Likewise, owing to various methodological issues (notably susceptibility-related signal loss in ventral temporal regions), functional MRI (fMRI) studies have consistently undersampled activation in the middle and inferior ATL. Since the initial ATL-hub proposal, the role of this region in semantic processing has been extensively studied using various methodologies. Together, this work corroborates and extends several predictions of the hub-and-spoke model and clarifies the anatomical organization and functioning of the ATL region.

The cross-modal hub is centred on the ventrolateral ATL. Key postulates of the original hub-and-spoke model have been validated using various methods (Supplementary information S1 (figure) and Supplementary information S3 (figure)). The ATLs are engaged in semantic processing irrespective of input modality (for example, words, objects, pictures or sounds) and conceptual categories. Although the hub is more strongly engaged for more-specific concepts (for example, Pekinese), it also supports basic (for example, dog) and domain-level (for example, animal) distinctions. Both left and right ATLs are implicated in verbal and non-verbal semantic processing (Box 3). ATL function is semantically selective insofar as these regions are not engaged in equally demanding non-semantic tasks.

These methods also provide important information that cannot be extracted from SD studies alone. Indeed, distortion-corrected fMRI in healthy individuals, cortical grid-electrode stimulation and electrocorticography in neurosurgical patients, and ¹⁸F-fluorodeoxyglucose positron emission tomography in patients with SD (Fig. 2) all indicate that the ventral-ventrolateral ATL is the cross-modal centre-point of the hub for multimodal naming and comprehension. Moreover, as predicted by the hub-and-spoke model, multivoxel pattern analyses of fMRI and electrocorticography data have shown semantic coding and representational merging of modality-specific information sources in the same area (Supplementary information S4 (figure)). Furthermore, in the ventral ATL, detailed semantic information is activated from 250 ms post stimulus onset (Supplementary information S4 (figure)), whereas coarse, domain-level distinctions may be available earlier (∼120 ms post stimulus onset). Inhibitory transcranial magnetic stimulation (TMS) of the lateral ATL produces domain-general semantic slowing, whereas TMS of 'spoke' regions produces a category-sensitive effect (Supplementary information S5 (figure)) -- confirming the importance of both hub and spokes in semantic representation. In healthy participants, ATL regions exhibit intrinsic connectivity (as detected by resting-state fMRI) with modality-specific brain areas, and, in patients with SD, comprehension accuracy reflects both the degree of ATL atrophy and the extent of reduction in hub-spoke functional connectivity. This body of work suggests that the cross-modal hub is centred on the ventrolateral ATL and corroborates core predictions of the hub-and-spoke view: namely, that this region plays a central role in coordinating communication among the modality-specific 'spokes' and that, in so doing, it encodes the semantic similarity structure among items.

The broader ATL is graded in its function. The original hub-and-spoke model said little about different ATL subregions, partly because the distribution of atrophy in SD is extremely consistent (being maximal in polar and ventral ATL regions) (Fig. 2C). Likewise, there is little variation in patients' multimodal semantic impairments, apart from small effects that are linked to whether atrophy is more severe in the left or right ATL early in the course of the disease (Box 3). New evidence indicates not only that the ventrolateral ATL is the centre-point of the hub (as reviewed above) but also that its function varies in a graded manner across ATL subregions (Fig. 2A,B).

The first clue for graded functional variation comes from cytoarchitecture. Brodmann divided the anterior temporal region into several different areas, and modern neuroanatomical techniques have generated finer differentiations. However, Brodmann also noted that cytoarchitectonic changes in the temporal cortex were graded: "to avoid erroneous interpretations it should again be stated that not all these regions are demarcated from each other by sharp borders but may undergo gradual transitions as, for example, in the temporal and parietal regions" (Ref. 56). This observation is replicated in contemporary cytoarchitectonic investigations, which indicate potentially graded patterns of functional differentiation across the ATL region.

The second insight arises from structural and functional connectivity. Consistent with the hub-and-spoke model, major white-matter fasciculi in both human and non-human primates converge in ATL regions; however, their points of termination are only partially overlapping, leading to graded partial differentiations in gross connectivity across ATL subregions. For instance, the uncinate fasciculus connects the orbitofrontal cortex and pars orbitalis most strongly to the temporopolar cortex; other prefrontal connections through the extreme capsule complex preferentially terminate in superior ATL regions, as does the middle longitudinal fasciculus from the inferior parietal lobule; and the inferior longitudinal fasciculus connects most strongly to the ventral and ventromedial ATL. The effects of these partially overlapping fasciculus terminations are made more graded through the strong local U-fibre connections in the ATL. A similar pattern of partially overlapping connectivity has also been observed in resting-state and task-active fMRI studies: in addition to strong intra-ATL connectivity, the temporopolar cortex shows greatest functional connectivity to orbitofrontal areas; the inferolateral ATL exhibits most connectivity to frontal and posterior regions that are associated with controlled semantic processing; and the superior ATL connects most strongly to primary auditory and premotor regions.

Third, recent neuroimaging data (from studies that have addressed the methodological issues associated with imaging semantic tasks in the ATL region) are highly consistent with a graded connectivity-driven model of ATL function (Fig. 2A). As noted above, the ventrolateral ATL activates strongly in semantic tasks irrespective of input modality or stimulus category. Moving away from this centre-point, the cross-modal semantic function of the ATL becomes weaker and more tied to a specific input modality (Fig. 2B). Thus, more medial ATL regions show greater responsiveness to picture-based materials and concrete concepts than to other types of material. The anterior superior temporal sulcus (STS)-superior temporal gyrus (STG) exhibits the opposite pattern, with greater activation for auditory stimuli, spoken words and abstract concepts, and an overlapping region of the STG has been implicated in combinatorial semantic processes. Last, polar and dorsal ATL areas show preferential activity for social concepts over other kinds of concept.

One possible explanation for these graded functional variations is that multiple mutually exclusive ATL subregions are dedicated to different categories or representational modalities. However, there are two problems with this view. First, it is not consistent with the cytoarchitectonic, connectivity and functional data, all of which suggest that the ATL exhibits graded functional specialization rather than discrete functional regions. Second, such an account does not explain the role of the hub, which seems to support knowledge across virtually all domains and modalities. An alternative view is that the ATL hub exhibits graded functional specialization (Fig. 2), with the responsivity of different subregions reflecting graded differences in their connectivity to the rest of the network. On this view, the neuroimaging findings that were noted above reflect the fact that neighbouring ATL regions contribute somewhat more or less to the representation of different kinds of information, depending on the strength of their interactions with various modality-specific representational systems.

Such graded functional specialization arises directly from the influence of connectivity on function. In a close variant of the hub-and-spoke model, Plaut introduced distance-dependent connection strengths between hub units and the modality-specific spokes. The importance of each processing unit in the model to a given function depended on its connectivity strength to the spokes. Central hub units furthest from all inputs contributed equally to all semantic tasks; units that were anatomically closer to a given modality-specific spoke took part in all types of semantic processing but contributed somewhat more to tasks involving the proximal modality. For instance, hub units situated near to visual representations would contribute more to tasks such as picture naming but less to non-visual tasks (for example, naming items in response to their characteristic sound). The graded hub hypothesis extends this proposal by assuming that ATL functionality is shaped by long-range cortical connectivity (Fig. 2A). Thus, the medial ATL responds more to visual or concrete concepts by virtue of having greater connectivity to visual than to auditory or linguistic systems; the anterior STS-STG contributes more to abstract concepts and verbal semantic processing by virtue of its greater connectivity to language than to visual systems; and the temporal pole contributes somewhat more to social concepts by virtue of its connectivity to networks that support social cognition and affect. The ventrolateral ATL remains important for all domains because it connects equally to these different systems.
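The distance-dependent principle can be sketched in a few lines. The geometry and exponential fall-off below are invented for illustration (Plaut's simulations differ in their details): hub units lying midway between two spokes weight both modalities roughly equally, which is the model analogue of the cross-modal centre-point, whereas units adjacent to a spoke are biased towards its modality.

```python
# Toy version of distance-dependent hub-spoke connectivity: connection
# strength decays with anatomical distance (decay constant is invented).
import numpy as np

N_HUB = 9
positions = np.linspace(0.0, 1.0, N_HUB)  # hub units on a visual-to-auditory axis
VISUAL_POS, AUDITORY_POS = 0.0, 1.0       # locations of the two spokes

def strength(unit_pos, spoke_pos, decay=3.0):
    """Connection strength falls off exponentially with distance to a spoke."""
    return np.exp(-decay * np.abs(unit_pos - spoke_pos))

w_visual   = strength(positions, VISUAL_POS)
w_auditory = strength(positions, AUDITORY_POS)

# Central units weight both modalities about equally (domain-general);
# units near either end are graded towards that modality.
for pos, wv, wa in zip(positions, w_visual, w_auditory):
    print(f"position={pos:.2f}  visual={wv:.2f}  auditory={wa:.2f}")
```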

We note here that this type of graded function is not unique to the ATL-hub region or to semantic processing. Indeed, other cortical regions and types of processing (for example, the visual and auditory processing streams) also demonstrate graded functional profiles that follow the underlying patterns of connectivity. Connectivity-induced graded function may therefore be a general cortical principle. Moreover, information arriving at the ATL hub has already been partially processed within these graded non-ATL regions and through the interactions between the ATL and modality-specific regions.

Theories of semantic representation and its neural basis have been strongly influenced by two sets of neuropsychological and functional neuroimaging data, leading to two different theoretical positions. One literature has focused on the general semantic impairment that is observed in some types of brain disease, demonstrating largely equivalent disruption across types of knowledge. Such data support proposals -- including the hub-and-spoke model -- that the cortical semantic system is widely distributed and interactive but needs a transmodal component to capture coherent, generalizable concepts. The second literature focuses on 'category-specific' variations in performance, in which different categories of knowledge can be differentially disrupted in neurological disorders or can differentially activate specific brain regions in healthy individuals. Perhaps the most commonly studied, although by no means the sole, contrast is between living things and man-made items. Such evidence has been used to argue that anatomically distinct and functionally independent neural systems have evolved to support knowledge about different conceptual domains (for example, animals, tools, faces and scenes).

Recent empirical and computational investigations have extended the hub-and-spoke framework into a unified theory that may account for both sets of data. In the neuropsychological literature, several large case-series investigations provide contrastive patterns of semantic impairment and clear information about the critical neural regions. For example, patients with SD with bilateral ATL atrophy have generalized semantic impairment and largely similar performance levels across different categories of knowledge (once other important performance factors, especially stimulus familiarity and typicality, are controlled). By contrast, patients with posterior ventral occipito-temporal lesions can present with relatively poor identification of natural kinds, and patients with anteromedially centred temporal-lobe damage following an acute period of herpes simplex virus encephalitis (HSVE) show strikingly poorer knowledge of natural kinds than of man-made items. Last, patients with temporoparietal damage show the greatest deficits for praxis-related man-made items. These contrastive behavioural-anatomical associations for general versus category-specific semantic impairments find counterparts in convergent evidence from other techniques, including functional neuroimaging and inhibitory TMS in healthy participants and cortical electrode studies of neurosurgical patients.

All these findings can be captured by the connectivity-constrained version of the hub-and-spoke model. The first key notion, which was already expressed but is worth reiterating, is that semantic representations are not just hub based but reflect collaborations between hub and spokes (Supplementary information S5 (figure)). The second is that, consistent with embodied semantic models, modality-specific information (for example, praxis) will be differentially important for some categories (for example, tools). It follows that the progressive degradation of the ATL transmodal hub in patients with SD will generate a category-general pattern, whereas selective damage to spokes can lead to category-specific deficits. Thus, impaired praxis or functional knowledge is deleterious for manipulable man-made items, whereas reduced high-acuity visual input is particularly challenging for differentiating between animals given their shared visual contours. The differential contributions of the hub versus spokes in semantic representation have been demonstrated using TMS in neurologically intact individuals. Indeed, a study showed that such individuals exhibit a category-general effect following lateral ATL stimulation but a category-specific pattern, with slower naming of man-made objects, when the praxis-coding parietal region was directly stimulated. The connectivity-constrained hub-and-spoke model also offers insights into other empirical observations that were noted above. For example, the medial ventral occipito-temporal region exhibits greater activation for man-made items, in part because it is directly connected to the parietal praxis-coding regions; and an explanation in these terms accounts for the evidence that congenitally blind participants show greater activation for man-made items than for animate things in this 'visual' region.

A remaining challenge is to explain the difference between semantic impairment in HSVE and SD. Despite highly overlapping areas of ATL damage in these conditions (albeit damage is more medially focused in HSVE), individuals with HSVE commonly show better knowledge for man-made artefacts than for natural-kind concepts; this finding is rarely observed in individuals with SD. However, a crucial factor in this particular category effect has been acknowledged in one form or another by virtually all researchers who have studied it. Recall that concepts can be categorised at superordinate (for example, animal or tool), basic (for example, dog or knife) or specific (for example, poodle or bread knife) levels. Most semantic research has focused on the basic level, and, at this conceptually salient level, animate or natural-kind concepts tend to be visually and conceptually more similar to one another, and hence more confusable, than man-made things. It is therefore an extremely important explanatory clue that the artefact versus animate performance difference in HSVE holds for the basic level but is eliminated at the subordinate level, at which cases with HSVE are equally and severely impaired for both categories. The obvious interpretation, although this requires more empirical testing, is that the medial temporal lobe region that is typically damaged by the herpes virus is crucial not for distinguishing between living things but for distinguishing between visually or semantically confusable things, which include different types of knife and different breeds of dog. This possibility is compatible with the graded hub-and-spoke hypothesis and the existing evidence of graded, connectivity-driven differential contributions to representation of abstract and concrete concepts across ATL subregions (Fig. 2B), with a preference for concrete items in the medial ATL.

One further factor merits mention: SD is a neurodegenerative disease, yielding steady degradation of the ATL and, consequently, of conceptual knowledge. Although patients with SD continue to be surrounded by the multimodal experiences that continuously reinforce and extend conceptual knowledge in a healthy brain, the slow-but-constant deterioration of semantic knowledge in SD is largely incompatible with relearning. By contrast, successfully treated HSVE is an acute illness that is followed by some degree of recovery and relearning. These differences can be mimicked in the hub-and-spoke computational model by comparing progressive degradation against en masse hub damage followed by a period of retraining: the former generates a category-general effect, whereas the latter results in better performance on man-made than on animate concepts. The latter outcome arises because, with reduced representational resources, the model struggles to recapture sufficient 'semantic acuity' to differentiate between the conceptually tightly packed animate items and subordinate exemplars.
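The 'semantic acuity' argument can be illustrated with a toy simulation. The sketch below is not the published model: relearning with reduced representational resources is approximated by a low-rank (few-component) reconstruction of an invented conceptual space in which animate concepts form one tight cluster and man-made concepts are widely dispersed. The coarse reconstruction preserves the identities of the dispersed items far better than those of the tightly packed ones.

```python
# Toy demonstration (invented parameters, not the published simulation):
# reduced representational capacity, modelled as low-rank reconstruction,
# collapses tightly clustered (animate) concepts before dispersed (man-made) ones.
import numpy as np

rng = np.random.default_rng(1)
DIM, N = 50, 30

centre    = rng.normal(0, 1, DIM)
animals   = centre + 0.2 * rng.normal(0, 1, (N, DIM))  # one tight cluster
artefacts = rng.normal(0, 1, (N, DIM))                 # widely dispersed items
concepts  = np.vstack([animals, artefacts])

# 'Relearning with reduced resources': rebuild the space from only the
# first k principal components of the concept matrix.
mean = concepts.mean(axis=0)
U, S, Vt = np.linalg.svd(concepts - mean, full_matrices=False)
k = 5
reconstructed = (U[:, :k] * S[:k]) @ Vt[:k] + mean

# An item is 'identified' if its coarse reconstruction is still closer to
# its own original representation than to any other concept's.
d = np.linalg.norm(reconstructed[:, None] - concepts[None, :], axis=2)
correct = d.argmin(axis=1) == np.arange(2 * N)
print(f"animate identification : {correct[:N].mean():.2f}")   # typically near chance
print(f"man-made identification: {correct[N:].mean():.2f}")   # typically near ceiling
```

With these invented parameters, the few retained components capture the broad separation among dispersed man-made items but discard the fine within-cluster distinctions among animate items, mirroring the model's loss of 'semantic acuity' for conceptually crowded categories.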
