The Vector Grounding Problem
- URL: http://arxiv.org/abs/2304.01481v2
- Date: Thu, 05 Jun 2025 07:55:56 GMT
- Title: The Vector Grounding Problem
- Authors: Dimitri Coelho Mollo, Raphaël Millière
- Abstract summary: Drawing on philosophical theories of representational content, we argue that LLMs can achieve referential grounding. One potentially surprising implication of our discussion is that multimodality and embodiment are neither necessary nor sufficient to overcome the Grounding Problem.
- Score: 0.047888359248129786
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The remarkable performance of large language models (LLMs) on complex linguistic tasks has sparked debate about their capabilities. Unlike humans, these models learn language solely from textual data without directly interacting with the world. Yet they generate seemingly meaningful text on diverse topics. This achievement has renewed interest in the classical 'Symbol Grounding Problem' -- the question of whether the internal representations and outputs of symbolic AI systems can possess intrinsic meaning that is not parasitic on external interpretation. Although modern LLMs compute over vectors rather than symbols, an analogous problem arises for these systems, which we call the Vector Grounding Problem. This paper has two main goals. First, we distinguish five main notions of grounding that are often conflated in the literature, and argue that only one of them, which we call referential grounding, is relevant to the Vector Grounding Problem. Second, drawing on philosophical theories of representational content, we provide two arguments for the claim that LLMs and related systems can achieve referential grounding: (1) through preference fine-tuning methods that explicitly establish world-involving functions, and (2) through pre-training alone, which in limited domains may select for internal states with world-involving content, as mechanistic interpretability research suggests. Through these pathways, LLMs can establish connections to the world sufficient for intrinsic meaning. One potentially surprising implication of our discussion is that multimodality and embodiment are neither necessary nor sufficient to overcome the Grounding Problem.
Related papers
- On measuring grounding and generalizing grounding problems [0.0]
The symbol grounding problem asks how tokens like 'cat' can be about cats, as opposed to mere shapes manipulated in a calculus. We recast grounding from a binary judgment into an audit across desiderata, each indexed by an evaluation. We apply this framework to four grounding modes (symbolic, referential, vectorial, relational) and three case studies.
arXiv Detail & Related papers (2025-12-05T22:58:47Z) - Decoupling Understanding from Reasoning via Problem Space Mapping for Small-scale Model Reasoning [22.582715282848795]
We propose a new framework that decouples understanding from reasoning by mapping natural language problems into a canonical problem space. Within this framework, we introduce DURIT, a three-step algorithm that iteratively aligns reasoning trajectories through self-distillation. Experiments show that DURIT substantially improves SLMs' performance on both in-domain and out-of-domain mathematical and logical reasoning tasks.
arXiv Detail & Related papers (2025-08-07T01:13:30Z) - Grounding from an AI and Cognitive Science Lens [4.624355582375099]
This article explores grounding from both cognitive science and machine learning perspectives.
It identifies the subtleties of grounding, its significance for collaborative agents, and similarities and differences in grounding approaches in both communities.
arXiv Detail & Related papers (2024-02-19T17:44:34Z) - Three Pathways to Neurosymbolic Reinforcement Learning with Interpretable Model and Policy Networks [4.242435932138821]
We study a class of neural networks that build interpretable semantics directly into their architecture.
We reveal and highlight both the potential and the essential difficulties of combining logic, simulation, and learning.
arXiv Detail & Related papers (2024-02-07T23:00:24Z) - Exploring Spatial Schema Intuitions in Large Language and Vision Models [8.944921398608063]
We investigate whether large language models (LLMs) effectively capture implicit human intuitions about building blocks of language.
Surprisingly, correlations between model outputs and human responses emerge, revealing adaptability without a tangible connection to embodied experiences.
This research contributes to a nuanced understanding of the interplay between language, spatial experiences, and computations made by large language models.
arXiv Detail & Related papers (2024-02-01T19:25:50Z) - Grounding for Artificial Intelligence [8.13763396934359]
Grounding is the process of connecting natural language and abstract knowledge to the internal representation of the real world in an intelligent being.
This paper makes an attempt to systematically study this problem.
arXiv Detail & Related papers (2023-12-15T04:45:48Z) - Visually Grounded Language Learning: a review of language games, datasets, tasks, and models [60.2604624857992]
Many Vision+Language (V+L) tasks have been defined with the aim of creating models that can ground symbols in the visual modality.
In this work, we provide a systematic literature review of several tasks and models proposed in the V+L field.
arXiv Detail & Related papers (2023-12-05T02:17:29Z) - Grounding Gaps in Language Model Generations [67.79817087930678]
We study whether large language models generate text that reflects human grounding.
We find that -- compared to humans -- LLMs generate language with less conversational grounding.
To understand the roots of the identified grounding gap, we examine the role of instruction tuning and preference optimization.
arXiv Detail & Related papers (2023-11-15T17:40:27Z) - Brain in a Vat: On Missing Pieces Towards Artificial General Intelligence in Large Language Models [83.63242931107638]
We propose four characteristics of generally intelligent agents.
We argue that active engagement with objects in the real world delivers more robust signals for forming conceptual representations.
We conclude by outlining promising future research directions in the field of artificial general intelligence.
arXiv Detail & Related papers (2023-07-07T13:58:16Z) - Kosmos-2: Grounding Multimodal Large Language Models to the World [107.27280175398089]
We introduce Kosmos-2, a Multimodal Large Language Model (MLLM).
It enables new capabilities of perceiving object descriptions (e.g., bounding boxes) and grounding text to the visual world.
Code and pretrained models are available at https://aka.ms/kosmos-2.
arXiv Detail & Related papers (2023-06-26T16:32:47Z) - DiPlomat: A Dialogue Dataset for Situated Pragmatic Reasoning [89.92601337474954]
Pragmatic reasoning plays a pivotal role in deciphering implicit meanings that frequently arise in real-life conversations.
We introduce a novel challenge, DiPlomat, aiming at benchmarking machines' capabilities on pragmatic reasoning and situated conversational understanding.
arXiv Detail & Related papers (2023-06-15T10:41:23Z) - NELLIE: A Neuro-Symbolic Inference Engine for Grounded, Compositional, and Explainable Reasoning [59.16962123636579]
This paper proposes a new take on Prolog-based inference engines.
We replace handcrafted rules with a combination of neural language modeling, guided generation, and semi-dense retrieval.
Our implementation, NELLIE, is the first system to demonstrate fully interpretable, end-to-end grounded QA.
arXiv Detail & Related papers (2022-09-16T00:54:44Z) - GreaseLM: Graph REASoning Enhanced Language Models for Question Answering [159.9645181522436]
GreaseLM is a new model that fuses encoded representations from pretrained LMs and graph neural networks over multiple layers of modality interaction operations.
We show that GreaseLM can more reliably answer questions that require reasoning over both situational constraints and structured knowledge, even outperforming models 8x larger.
arXiv Detail & Related papers (2022-01-21T19:00:05Z) - Emergence of Machine Language: Towards Symbolic Intelligence with Neural Networks [73.94290462239061]
We propose to combine symbolism and connectionism principles by using neural networks to derive a discrete representation.
By designing an interactive environment and task, we demonstrated that machines could generate a spontaneous, flexible, and semantic language.
arXiv Detail & Related papers (2022-01-14T14:54:58Z) - Provable Limitations of Acquiring Meaning from Ungrounded Form: What will Future Language Models Understand? [87.20342701232869]
We investigate the abilities of ungrounded systems to acquire meaning.
We study whether assertions enable a system to emulate representations preserving semantic relations like equivalence.
We find that assertions enable semantic emulation if all expressions in the language are referentially transparent.
However, if the language uses non-transparent patterns like variable binding, we show that emulation can become an uncomputable problem.
arXiv Detail & Related papers (2021-04-22T01:00:17Z) - Learning Zero-Shot Multifaceted Visually Grounded Word Embeddings via Multi-Task Training [8.271859911016719]
Language grounding aims at linking the symbolic representation of language (e.g., words) into the rich perceptual knowledge of the outside world.
We argue that this approach sacrifices the abstract knowledge obtained from linguistic co-occurrence statistics in the process of acquiring perceptual information.
arXiv Detail & Related papers (2021-04-15T14:49:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.