The Representational Alignment between Humans and Language Models is implicitly driven by a Concreteness Effect
- URL: http://arxiv.org/abs/2505.15682v1
- Date: Wed, 21 May 2025 15:57:58 GMT
- Title: The Representational Alignment between Humans and Language Models is implicitly driven by a Concreteness Effect
- Authors: Cosimo Iaia, Bhavin Choksi, Emily Wiebers, Gemma Roig, Christian J. Fiebach
- Abstract summary: We estimate semantic distances implicitly used by humans, for a set of carefully selected abstract and concrete nouns. We find that the implicit representational space of participants and the semantic representations of language models are significantly aligned. Results indicate that humans and language models converge on the concreteness dimension, but not on other dimensions.
- Score: 4.491391835956324
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The nouns of our language refer to either concrete entities (like a table) or abstract concepts (like justice or love), and cognitive psychology has established that concreteness influences how words are processed. Accordingly, understanding how concreteness is represented in our mind and brain is a central question in psychology, neuroscience, and computational linguistics. While the advent of powerful language models has allowed for quantitative inquiries into the nature of semantic representations, it remains largely underexplored how they represent concreteness. Here, we used behavioral judgments to estimate semantic distances implicitly used by humans, for a set of carefully selected abstract and concrete nouns. Using Representational Similarity Analysis, we find that the implicit representational space of participants and the semantic representations of language models are significantly aligned. We also find that both representational spaces are implicitly aligned to an explicit representation of concreteness, which was obtained from our participants using an additional concreteness rating task. Importantly, using ablation experiments, we demonstrate that the human-to-model alignment is substantially driven by concreteness, but not by other important word characteristics established in psycholinguistics. These results indicate that humans and language models converge on the concreteness dimension, but not on other dimensions.
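The abstract's core method, Representational Similarity Analysis, compares pairwise dissimilarity structures across systems. The sketch below illustrates the idea with entirely hypothetical stand-in data (random "model embeddings" and noisy "human" dissimilarities for six nouns); it is not the paper's actual data or pipeline.

```python
# Minimal RSA sketch: correlate a model RDM with a human RDM.
# All data are hypothetical stand-ins, purely for illustration.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical model embeddings for 6 nouns (rows) in a 4-d space.
embeddings = rng.normal(size=(6, 4))

# Model representational dissimilarity matrix (RDM) in condensed form:
# pairwise cosine distances over the upper triangle.
normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
iu = np.triu_indices(len(embeddings), k=1)
model_rdm = 1.0 - (normed @ normed.T)[iu]

# Stand-in for human dissimilarities estimated from behavioral judgments
# (here: the model RDM plus noise, since no real judgments are available).
human_rdm = model_rdm + rng.normal(scale=0.1, size=model_rdm.shape)

# RSA alignment: Spearman rank correlation between the two RDMs
# (argsort-of-argsort yields ranks; distances are continuous, so no ties).
def ranks(a):
    return np.argsort(np.argsort(a)).astype(float)

rho = np.corrcoef(ranks(model_rdm), ranks(human_rdm))[0, 1]
print(f"RSA alignment (Spearman rho) = {rho:.2f}")
```

A significant positive rho would indicate that the two representational spaces order word pairs similarly, which is the sense of "alignment" used in the abstract.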
Related papers
- Human-like conceptual representations emerge from language prediction [72.5875173689788]
Large language models (LLMs) trained exclusively through next-token prediction over language data exhibit remarkably human-like behaviors. Are these models developing concepts akin to humans, and if so, how are such concepts represented and organized? Our results demonstrate that LLMs can flexibly derive concepts from linguistic descriptions in relation to contextual cues about other concepts. These findings establish that structured, human-like conceptual representations can naturally emerge from language prediction without real-world grounding.
arXiv Detail & Related papers (2025-01-21T23:54:17Z) - A Grounded Typology of Word Classes [7.201565960962933]
Inspired by information theory, we define "groundedness", an empirical measure of semantic contentfulness. Our measure captures the contentfulness asymmetry between functional (grammatical) and lexical (content) classes across languages. We release a dataset of groundedness scores for 30 languages.
arXiv Detail & Related papers (2024-12-13T18:58:48Z) - Evaluating Contextualized Representations of (Spanish) Ambiguous Words: A New Lexical Resource and Empirical Analysis [2.2530496464901106]
We evaluate semantic representations of Spanish ambiguous nouns in context in a suite of Spanish-language monolingual and multilingual BERT-based models. We find that various BERT-based LMs' contextualized semantic representations capture some variance in human judgments but fall short of the human benchmark.
arXiv Detail & Related papers (2024-06-20T18:58:11Z) - Identifying and interpreting non-aligned human conceptual representations using language modeling [0.0]
We show that congenital blindness induces conceptual reorganization in both amodal and sensory-related verbal domains.
We find that blind individuals more strongly associate social and cognitive meanings to verbs related to motion.
For some verbs, the representations of blind and sighted individuals are highly similar.
arXiv Detail & Related papers (2024-03-10T13:02:27Z) - Agentività e telicità in GilBERTo: implicazioni cognitive (Agentivity and telicity in GilBERTo: cognitive implications) [77.71680953280436]
The goal of this study is to investigate whether a Transformer-based neural language model infers lexical semantics.
The semantic properties considered are telicity (also combined with definiteness) and agentivity.
arXiv Detail & Related papers (2023-07-06T10:52:22Z) - Natural Language Decompositions of Implicit Content Enable Better Text Representations [52.992875653864076]
We introduce a method for the analysis of text that takes implicitly communicated content explicitly into account. We use a large language model to produce sets of propositions that are inferentially related to the text that has been observed. Our results suggest that modeling the meanings behind observed language, rather than the literal text alone, is a valuable direction for NLP.
arXiv Detail & Related papers (2023-05-23T23:45:20Z) - Transparency Helps Reveal When Language Models Learn Meaning [71.96920839263457]
Our systematic experiments with synthetic data reveal that, with languages where all expressions have context-independent denotations, both autoregressive and masked language models learn to emulate semantic relations between expressions.
Turning to natural language, our experiments with a specific phenomenon -- referential opacity -- add to the growing body of evidence that current language models do not well-represent natural language semantics.
arXiv Detail & Related papers (2022-10-14T02:35:19Z) - Visual Superordinate Abstraction for Robust Concept Learning [80.15940996821541]
Concept learning constructs visual representations that are connected to linguistic semantics.
We ascribe the bottleneck to a failure of exploring the intrinsic semantic hierarchy of visual concepts.
We propose a visual superordinate abstraction framework for explicitly modeling semantic-aware visual subspaces.
arXiv Detail & Related papers (2022-05-28T14:27:38Z) - What Drives the Use of Metaphorical Language? Negative Insights from Abstractness, Affect, Discourse Coherence and Contextualized Word Representations [13.622570558506265]
Given a specific discourse, which discourse properties trigger the use of metaphorical language, rather than using literal alternatives?
Many NLP approaches to metaphorical language rely on cognitive and (psycho-)linguistic insights and have successfully defined models of discourse coherence, abstractness and affect.
In this work, we build five simple models relying on established cognitive and linguistic properties to predict the use of a metaphorical vs. synonymous literal expression in context.
arXiv Detail & Related papers (2022-05-23T08:08:53Z) - Probing Contextual Language Models for Common Ground with Visual Representations [76.05769268286038]
We design a probing model that evaluates how effective text-only representations are at distinguishing between matching and non-matching visual representations.
Our findings show that language representations alone provide a strong signal for retrieving image patches from the correct object categories.
Visually grounded language models slightly outperform text-only language models in instance retrieval, but greatly underperform humans.
arXiv Detail & Related papers (2020-05-01T21:28:28Z) - The Fluidity of Concept Representations in Human Brain Signals [0.0]
We analyze the discriminability of concrete and abstract concepts in fMRI data.
We argue that fluid concept representations lead to more realistic models of human language processing.
arXiv Detail & Related papers (2020-02-20T17:31:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.