New preprint: "Sensorimotor distance: A fully grounded measure of semantic similarity for 800 million concept pairs"

Together with my colleague and lab PI Louise Connell, I have developed a new measure of semantic distance between concepts. It is based on the senses and body parts involved in experiencing those concepts — in other words, it is fully grounded in sensorimotor experience. This sets it apart from other measures of semantic distance, such as those based on the distributions of words in language, on encyclopaedic databases, or on lists of properties or features. It is also fairly comprehensive (thanks to the expansive norms collected by colleagues), with distances available for nearly 800 million pairs of concepts.
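To give a flavour of how a grounded distance like this can work, here is a minimal sketch that compares concepts by the strength with which they are experienced through different senses and body parts. The dimension names follow the Lancaster Sensorimotor Norms, but the rating values and the choice of cosine distance are illustrative assumptions on my part, not the measure defined in the preprint.

```python
import math

# Hypothetical sensorimotor strength ratings (0-5 scale) over the
# 11 dimensions of the Lancaster Sensorimotor Norms (6 perceptual
# modalities + 5 action effectors). All values here are invented
# purely for illustration.
norms = {
    #          aud  gus  hap  int  olf  vis  leg  arm  head mouth torso
    "lemon":  [0.5, 4.5, 2.0, 0.5, 3.0, 4.5, 0.2, 2.5, 0.5, 4.0, 0.2],
    "orange": [0.4, 4.6, 2.2, 0.6, 3.2, 4.6, 0.2, 2.8, 0.5, 4.1, 0.2],
    "hammer": [2.5, 0.1, 4.0, 0.2, 0.3, 4.2, 0.5, 4.8, 0.5, 0.3, 1.0],
}

def cosine_distance(u, v):
    """1 minus the cosine similarity of two rating vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

# Two fruits, experienced through similar senses and effectors,
# come out much closer than a fruit and a tool.
d_fruit = cosine_distance(norms["lemon"], norms["orange"])
d_tool = cosine_distance(norms["lemon"], norms["hammer"])
```

The appeal of a measure like this is that it never consults language at all: two concepts are close purely because they are experienced in similar ways.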

Figure: two panels, each showing an arrangement of dots labelled with concepts. Left panel: select nouns for tools, emotions, fruit and celestial objects. Right panel: select verbs for leg, hand, mouth and cognitive actions. Within each panel, positions are based on sensorimotor distances between concepts, transformed into two dimensions using Sammon mapping.
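Sammon mapping itself is a classic technique: it places items in two dimensions so that the 2-D distances approximate the original pairwise distances, by gradient descent on Sammon's stress. A small, illustrative implementation (not the code used to produce the figure) might look like this:

```python
import math
import random

def sammon_2d(dist, n_iter=500, lr=0.1, seed=0):
    """Embed n items in 2-D so that pairwise Euclidean distances
    approximate the given distance matrix `dist`, by plain gradient
    descent on Sammon's stress. A teaching sketch, not optimised."""
    n = len(dist)
    rng = random.Random(seed)
    pts = [[rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for _ in range(n)]
    # Normalising constant: sum of all original pairwise distances.
    scale = sum(dist[i][j] for i in range(n) for j in range(i + 1, n))

    def stress(p):
        total = 0.0
        for i in range(n):
            for j in range(i + 1, n):
                d = math.hypot(p[i][0] - p[j][0], p[i][1] - p[j][1])
                total += (dist[i][j] - d) ** 2 / max(dist[i][j], 1e-9)
        return total / scale

    for _ in range(n_iter):
        grads = [[0.0, 0.0] for _ in range(n)]
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                dx = pts[i][0] - pts[j][0]
                dy = pts[i][1] - pts[j][1]
                d = max(math.hypot(dx, dy), 1e-9)
                star = max(dist[i][j], 1e-9)
                # Gradient of the stress for the (i, j) pair w.r.t. pts[i]:
                # points that are too close get pushed apart, and vice versa.
                coef = 2.0 * (d - star) / (star * d * scale)
                grads[i][0] += coef * dx
                grads[i][1] += coef * dy
        for i in range(n):
            pts[i][0] -= lr * grads[i][0]
            pts[i][1] -= lr * grads[i][1]
    return pts, stress(pts)

# Toy example: three items where the first and third are farthest apart.
dist = [
    [0.0, 1.0, 1.5],
    [1.0, 0.0, 1.0],
    [1.5, 1.0, 0.0],
]
coords, final_stress = sammon_2d(dist)
```

After fitting, the relative spacing of the 2-D points mirrors the relative spacing in the original distance matrix, which is what makes plots like the one above readable.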

The measure is described in a new preprint, and you can search, visualise and play around with the distances using an online app I also developed (which produced the image above).

Let us know if you do anything cool with it!

New preprint: "Linguistic Distributional Knowledge and Sensorimotor Grounding both Contribute to Semantic Category Production"

My colleagues Briony Banks, Louise Connell and I recently submitted a paper reporting research we've been doing at Lancaster University over the last year.

Needless to say, the Covid-19 lockdowns in the UK have been a substantial impediment to this work, so it's really good to see it finally complete.

A figure taken from the paper preprint. The computational model has two components, "linguistic" and "sensorimotor". The linguistic component is illustrated by colour spreading through a network of connected concepts ("animal", "husbandry", "horse", "cow", etc.). The sensorimotor component is illustrated with bubbles of colour growing and popping, creating new circles as they meet new points in the space ("animal", "cat", "rain", etc.). In the centre, the list of all concepts reached in either component are listed.
Schematic illustration of the computational model operating for an example category.

New preprint: "Understanding the role of linguistic distributional knowledge in cognition"

I have recently submitted a paper based on work I have been doing at the Embodied Cognition Lab at Lancaster University. In it, we look at a large set of linguistic distributional models commonly used in cognitive psychology, evaluating each against a benchmark behavioural dataset.

Linguistic distributional models are computer models of knowledge, which learn representations of words and their associations from statistical regularities in huge collections of natural language text, such as databases of TV subtitles. The idea is that, just like people, these algorithms can learn something about the meanings of words by only observing how they are used, rather than through direct experience of their referents. To the degree that they do, they can then be used to model the kind of knowledge which people could gain in the same way. These models can be made to perform various tasks which rely on language, or predict how humans will perform these tasks under experimental conditions, and in this way we can evaluate them as models of human semantic memory.
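As an illustration of the general idea (a toy count-based sketch, not any of the specific models evaluated in the paper): a word's meaning can be approximated by counting which other words appear near it in text, and words can then be compared by the cosine similarity of their co-occurrence vectors.

```python
from collections import Counter

# A tiny toy corpus; real models are trained on huge collections of
# natural language text, such as databases of TV subtitles.
corpus = (
    "the cat chased the mouse . the dog chased the cat . "
    "the cat ate the fish . the dog ate the bone ."
).split()

WINDOW = 2  # number of context words counted either side of the target

def context_vector(target):
    """Count the words co-occurring with `target` within the window."""
    counts = Counter()
    for i, word in enumerate(corpus):
        if word == target:
            lo, hi = max(0, i - WINDOW), i + WINDOW + 1
            counts.update(w for w in corpus[lo:hi] if w != target)
    return counts

def cosine(u, v):
    """Cosine similarity of two sparse count vectors."""
    keys = set(u) | set(v)
    dot = sum(u[k] * v[k] for k in keys)
    norm_u = sum(x * x for x in u.values()) ** 0.5
    norm_v = sum(x * x for x in v.values()) ** 0.5
    return dot / (norm_u * norm_v)

# "cat" and "dog" occur in similar contexts (both chase and eat things),
# so they come out more similar to each other than either is to "fish".
sim_cat_dog = cosine(context_vector("cat"), context_vector("dog"))
sim_cat_fish = cosine(context_vector("cat"), context_vector("fish"))
```

Nothing in this sketch knows what a cat or a dog *is*; similarity emerges entirely from patterns of word usage, which is exactly the kind of knowledge these models are hypothesised to share with people.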

We show, perhaps unsurprisingly*, that different kinds of models are better or worse at capturing different aspects of human semantic processes.

A preprint of the report is available on PsyArXiv.

*Unsurprising to you as you read this, perhaps, but this is in fact the largest systematic comparison of such models yet undertaken, and thereby the first to properly weigh the evidence on this question.