Revealing emergent human-like conceptual representations from language prediction
- URL: http://arxiv.org/abs/2501.12547v4
- Date: Sat, 08 Nov 2025 09:32:54 GMT
- Title: Revealing emergent human-like conceptual representations from language prediction
- Authors: Ningyu Xu, Qi Zhang, Chao Du, Qiang Luo, Xipeng Qiu, Xuanjing Huang, Menghan Zhang
- Abstract summary: Large language models (LLMs) trained solely through next-token prediction on text exhibit strikingly human-like behaviors. Are these models developing concepts akin to those of humans? We found that LLMs can flexibly derive concepts from linguistic descriptions in relation to contextual cues about other concepts.
- Score: 90.73285317321312
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: People acquire concepts through rich physical and social experiences and use them to understand and navigate the world. In contrast, large language models (LLMs), trained solely through next-token prediction on text, exhibit strikingly human-like behaviors. Are these models developing concepts akin to those of humans? If so, how are such concepts represented, organized, and related to behavior? Here, we address these questions by investigating the representations formed by LLMs during an in-context concept inference task. We found that LLMs can flexibly derive concepts from linguistic descriptions in relation to contextual cues about other concepts. The derived representations converge toward a shared, context-independent structure, and alignment with this structure reliably predicts model performance across various understanding and reasoning tasks. Moreover, the convergent representations effectively capture human behavioral judgments and closely align with neural activity patterns in the human brain, providing evidence for biological plausibility. Together, these findings establish that structured, human-like conceptual representations can emerge purely from language prediction without real-world grounding, highlighting the role of conceptual structure in understanding intelligent behavior. More broadly, our work suggests that LLMs offer a tangible window into the nature of human concepts and lays the groundwork for advancing alignment between artificial and human intelligence.
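A rough, self-contained sketch of the kind of alignment measure the abstract refers to: a representational similarity analysis (RSA) that rank-correlates the dissimilarity structure of LLM-derived concept embeddings with a human-derived dissimilarity structure. This is only an illustration with random placeholder data, not the authors' pipeline; the concept count, embedding dimensionality, and distance metrics are assumptions.

```python
# Illustrative RSA sketch (not the paper's actual pipeline): compare the
# dissimilarity structure of LLM-derived concept embeddings with a
# human-derived dissimilarity structure over the same concepts.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_concepts, hidden_dim = 50, 768  # assumed sizes, for illustration only

# Placeholders: in practice these would be hidden states read out from the
# model during the in-context concept inference task, and a dissimilarity
# structure estimated from human similarity judgments (or neural data).
llm_embeddings = rng.normal(size=(n_concepts, hidden_dim))
human_dissim = pdist(rng.normal(size=(n_concepts, 10)))  # condensed form

# Representational dissimilarity of the LLM concepts (condensed form).
llm_dissim = pdist(llm_embeddings, metric="cosine")

# Second-order alignment: rank-correlate the two dissimilarity structures.
rho, p = spearmanr(llm_dissim, human_dissim)
print(f"RSA alignment (Spearman rho) = {rho:.3f}, p = {p:.3g}")
```

With real model readouts and behavioral or neural data in place of the placeholders, a higher rank correlation would indicate closer alignment between the model's conceptual structure and the human one.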
Related papers
- Emergent Structured Representations Support Flexible In-Context Inference in Large Language Models [77.98801218316505]
Large language models (LLMs) exhibit emergent behaviors suggestive of human-like reasoning.
We investigate the internal processing of LLMs during in-context concept inference.
arXiv Detail & Related papers (2026-02-08T03:14:39Z)
- FaCT: Faithful Concept Traces for Explaining Neural Network Decisions [56.796533084868884]
Deep networks have shown remarkable performance across a wide range of tasks, yet obtaining a global, concept-level understanding of how they function remains a key challenge.
We emphasize the faithfulness of concept-based explanations and propose a new model with model-inherent, mechanistic concept explanations.
Our concepts are shared across classes, and their contribution to the logit and their input visualization can be faithfully traced from any layer.
arXiv Detail & Related papers (2025-10-29T13:35:46Z)
- CLMN: Concept based Language Models via Neural Symbolic Reasoning [27.255064617527328]
Concept Language Model Network (CLMN) is a neural-symbolic framework that preserves both performance and interpretability.
CLMN represents concepts as continuous, human-readable embeddings.
The model augments the original text features with concept-aware representations and automatically induces interpretable logic rules.
arXiv Detail & Related papers (2025-10-11T06:58:44Z)
- From Words to Waves: Analyzing Concept Formation in Speech and Text-Based Foundation Models [20.244145418997377]
We analyze the conceptual structures learned by speech and textual models both individually and jointly.
We employ Latent Concept Analysis, an unsupervised method for uncovering latent representations in neural networks, to examine how semantic abstractions form across modalities.
arXiv Detail & Related papers (2025-06-01T19:33:21Z)
- From Tokens to Thoughts: How LLMs and Humans Trade Compression for Meaning [52.32745233116143]
Humans organize knowledge into compact categories through semantic compression.
Large Language Models (LLMs) demonstrate remarkable linguistic abilities, but whether their internal representations strike a human-like trade-off between compression and semantic fidelity is unclear.
arXiv Detail & Related papers (2025-05-21T16:29:00Z)
- Non-literal Understanding of Number Words by Language Models [33.24263583093367]
Humans naturally interpret numbers non-literally, combining context, world knowledge, and speaker intent.
We investigate whether large language models (LLMs) interpret numbers similarly, focusing on hyperbole and pragmatic halo effects.
arXiv Detail & Related papers (2025-02-10T07:03:00Z)
- Human-like object concept representations emerge naturally in multimodal large language models [24.003766123531545]
We combined behavioral and neuroimaging analysis methods to uncover how the object concept representations in Large Language Models correlate with those of humans.
The resulting 66-dimensional embeddings were found to be highly stable and predictive, and exhibited semantic clustering akin to human mental representations.
This study advances our understanding of machine intelligence and informs the development of more human-like artificial cognitive systems.
arXiv Detail & Related papers (2024-07-01T08:17:19Z)
- ConcEPT: Concept-Enhanced Pre-Training for Language Models [57.778895980999124]
ConcEPT aims to infuse conceptual knowledge into pre-trained language models.
It exploits external entity concept prediction, which predicts the concepts of entities mentioned in the pre-training contexts.
Experimental results show that ConcEPT gains improved conceptual knowledge through concept-enhanced pre-training.
arXiv Detail & Related papers (2024-01-11T05:05:01Z)
- Divergences between Language Models and Human Brains [59.100552839650774]
We systematically explore the divergences between human and machine language processing.
We identify two domains that LMs do not capture well: social/emotional intelligence and physical commonsense.
Our results show that fine-tuning LMs on these domains can improve their alignment with human brain responses.
arXiv Detail & Related papers (2023-11-15T19:02:40Z)
- Interpreting Pretrained Language Models via Concept Bottlenecks [55.47515772358389]
Pretrained language models (PLMs) have made significant strides in various natural language processing tasks.
The lack of interpretability due to their "black-box" nature poses challenges for responsible implementation.
We propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans.
arXiv Detail & Related papers (2023-11-08T20:41:18Z)
- Towards Concept-Aware Large Language Models [56.48016300758356]
Concepts play a pivotal role in various human cognitive functions, including learning, reasoning and communication.
There is very little work on endowing machines with the ability to form and reason with concepts.
In this work, we analyze how well contemporary large language models (LLMs) capture human concepts and their structure.
arXiv Detail & Related papers (2023-11-03T12:19:22Z)
- Interpretability is in the Mind of the Beholder: A Causal Framework for Human-interpretable Representation Learning [22.201878275784246]
The focus in Explainable AI is shifting from explanations defined in terms of low-level elements, such as input features, to explanations encoded in terms of interpretable concepts learned from data.
How to reliably acquire such concepts is, however, still fundamentally unclear.
We propose a mathematical framework for acquiring interpretable representations suitable for both post-hoc explainers and concept-based neural networks.
arXiv Detail & Related papers (2023-09-14T14:26:20Z)
- The Neuro-Symbolic Inverse Planning Engine (NIPE): Modeling Probabilistic Social Inferences from Linguistic Inputs [50.32802502923367]
We study how language drives and influences social reasoning in a probabilistic goal inference domain.
We propose a neuro-symbolic model that carries out goal inference from linguistic inputs of agent scenarios.
Our model closely matches human response patterns and better predicts human judgements than using an LLM alone.
arXiv Detail & Related papers (2023-06-25T19:38:01Z)
- From Word Models to World Models: Translating from Natural Language to the Probabilistic Language of Thought [124.40905824051079]
We propose rational meaning construction, a computational framework for language-informed thinking.
We frame linguistic meaning as a context-sensitive mapping from natural language into a probabilistic language of thought.
We show that LLMs can generate context-sensitive translations that capture pragmatically-appropriate linguistic meanings.
We extend our framework to integrate cognitively-motivated symbolic modules.
arXiv Detail & Related papers (2023-06-22T05:14:00Z)
- On the Computation of Meaning, Language Models and Incomprehensible Horrors [0.0]
We integrate foundational theories of meaning with a mathematical formalism of artificial general intelligence (AGI).
Our findings shed light on the relationship between meaning and intelligence, and how we can build machines that comprehend and intend meaning.
arXiv Detail & Related papers (2023-04-25T09:41:00Z)
- Conceptual structure coheres in human cognition but not in large language models [7.405352374343134]
We show that conceptual structure is robust to differences in culture, language, and method of estimation.
Results highlight an important difference between contemporary large language models and human cognition.
arXiv Detail & Related papers (2023-04-05T21:27:01Z)
- Probing Neural Language Models for Human Tacit Assumptions [36.63841251126978]
Humans carry stereotypic tacit assumptions (STAs), or propositional beliefs about generic concepts.
We construct a diagnostic set of word prediction prompts to evaluate whether recent neural contextualized language models trained on large text corpora capture STAs.
arXiv Detail & Related papers (2020-04-10T01:48:50Z)
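The last entry above probes language models with word-prediction prompts for stereotypic tacit assumptions. Below is a minimal sketch of that style of probing, assuming the Hugging Face fill-mask pipeline with bert-base-uncased; the prompt and target word are illustrative stand-ins, not items from the paper's diagnostic set.

```python
# Minimal cloze-style probe (illustrative; not the paper's diagnostic set):
# ask a masked language model to complete a generic-concept statement and
# check whether a stereotypic attribute appears among its top predictions.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

prompt = "A bear has [MASK]."   # hypothetical probe sentence
expected = "fur"                # attribute a human would tacitly assume

predictions = fill(prompt, top_k=10)
top_tokens = [p["token_str"].strip() for p in predictions]

print("Top predictions:", top_tokens)
print("Tacit assumption captured:", expected in top_tokens)
```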