What Does it Mean for a Neural Network to Learn a "World Model"?
- URL: http://arxiv.org/abs/2507.21513v1
- Date: Tue, 29 Jul 2025 05:30:57 GMT
- Title: What Does it Mean for a Neural Network to Learn a "World Model"?
- Authors: Kenneth Li, Fernanda Viégas, Martin Wattenberg
- Abstract summary: We propose a set of precise criteria for saying a neural net learns and uses a "world model." The goal is to give an operational meaning to terms that are often used informally. An essential addition to the definition is a set of conditions to check that such a "world model" is not a trivial consequence of the neural net's data or task.
- Score: 48.16769678219204
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a set of precise criteria for saying a neural net learns and uses a "world model." The goal is to give an operational meaning to terms that are often used informally, in order to provide a common language for experimental investigation. We focus specifically on the idea of representing a latent "state space" of the world, leaving modeling the effect of actions to future work. Our definition is based on ideas from the linear probing literature, and formalizes the notion of a computation that factors through a representation of the data generation process. An essential addition to the definition is a set of conditions to check that such a "world model" is not a trivial consequence of the neural net's data or task.
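As an informal illustration of this probing-based framing (not code from the paper), the sketch below trains a linear probe to recover a hypothetical latent "world state" from a network's hidden activations, alongside a control run meant to echo the paper's non-triviality conditions. All data, dimensions, and names are synthetic placeholders.

```python
# Minimal, self-contained sketch of a linear probe in the spirit of the
# paper's probing-based definition: check whether a network's internal
# activations linearly decode a latent "world state" of the data
# generation process. Everything here is synthetic and purely illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical setup: each input was generated from one of K latent world
# states; `activations` stands in for the net's hidden layer on those inputs.
n_examples, hidden_dim, n_states = 2000, 64, 4
latent_state = rng.integers(0, n_states, size=n_examples)

# Pretend the network embeds the latent state linearly, plus noise.
state_directions = rng.normal(size=(n_states, hidden_dim))
activations = state_directions[latent_state] + 0.5 * rng.normal(size=(n_examples, hidden_dim))

X_train, X_test, y_train, y_test = train_test_split(
    activations, latent_state, test_size=0.25, random_state=0)

# The probe itself: a linear classifier from activations to latent state.
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
probe_acc = probe.score(X_test, y_test)

# A control for triviality: the same kind of probe trained on random vectors
# of the same shape should not recover the state (cf. the paper's conditions
# for ruling out "world models" that come for free from the data or task).
control = rng.normal(size=activations.shape)
Xc_train, Xc_test, yc_train, yc_test = train_test_split(
    control, latent_state, test_size=0.25, random_state=0)
control_acc = LogisticRegression(max_iter=1000).fit(Xc_train, yc_train).score(Xc_test, yc_test)

print(f"probe accuracy on activations: {probe_acc:.2f}")
print(f"probe accuracy on control vectors: {control_acc:.2f}")
```

On this synthetic data the probe should score near 1.0 while the control stays near chance (0.25); only the gap between the two is meant to be suggestive of the kind of check the paper formalizes.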
Related papers
- On the Performance of Concept Probing: The Influence of the Data (Extended Version) [3.2443914909457594]
Concept probing works by training additional classifiers to map the internal representations of a model into human-defined concepts of interest. Research on concept probing has mainly focused on the model being probed or the probing model itself. In this paper, we investigate the effect of the data used to train probing models on their performance.
arXiv Detail & Related papers (2025-07-24T16:18:46Z) - Evaluating the World Model Implicit in a Generative Model [7.317896355747284]
Recent work suggests that large language models may implicitly learn world models.
This includes problems as diverse as simple logical reasoning, geographic navigation, game-playing, and chemistry.
We propose new evaluation metrics for world model recovery inspired by the classic Myhill-Nerode theorem from language theory (a toy state-equivalence check in this spirit is sketched after this list).
arXiv Detail & Related papers (2024-06-06T02:20:31Z) - Visually Grounded Language Learning: a review of language games, datasets, tasks, and models [60.2604624857992]
Many Vision+Language (V+L) tasks have been defined with the aim of creating models that can ground symbols in the visual modality.
In this work, we provide a systematic literature review of several tasks and models proposed in the V+L field.
arXiv Detail & Related papers (2023-12-05T02:17:29Z) - A Geometric Notion of Causal Probing [85.49839090913515]
The linear subspace hypothesis states that, in a language model's representation space, all information about a concept such as verbal number is encoded in a linear subspace. We give a set of intrinsic criteria which characterize an ideal linear concept subspace. We find that, for at least one concept across two language models, the concept subspace can be used to manipulate the concept value of the generated word with precision (a loose illustration of this kind of intervention is sketched after this list).
arXiv Detail & Related papers (2023-07-27T17:57:57Z) - State space models can express n-gram languages [51.823427608117626]
We build state space language models that can solve the next-word prediction task for languages generated from n-gram rules. Our proof shows how SSMs can encode n-gram rules using new theoretical results on their capacity. We conduct experiments with a small dataset generated from n-gram rules to show how our framework can be applied to SSMs and RNNs obtained through gradient-based optimization.
arXiv Detail & Related papers (2023-06-20T10:41:23Z) - Grounded Decoding: Guiding Text Generation with Grounded Models for Embodied Agents [111.15288256221764]
The Grounded Decoding project aims to solve complex, long-horizon tasks in a robotic setting by leveraging the knowledge of both a language model and grounded models (a toy version of this decoding scheme is sketched after this list).
We frame this as a problem similar to probabilistic filtering: decode a sequence that both has high probability under the language model and high probability under a set of grounded model objectives.
We demonstrate how such grounded models can be obtained across three simulation and real-world domains, and that the proposed decoding strategy is able to solve complex, long-horizon tasks in a robotic setting by leveraging the knowledge of both models.
arXiv Detail & Related papers (2023-03-01T22:58:50Z) - On Reality and the Limits of Language Data: Aligning LLMs with Human Norms [10.02997544238235]
Large Language Models (LLMs) harness linguistic associations in vast natural language data for practical applications.
We explore this question using a novel and tightly controlled reasoning test (ART) and compare human norms against versions of GPT-3.
Our findings highlight the categories of common-sense relations that models could learn directly from data, as well as areas of weakness.
arXiv Detail & Related papers (2022-08-25T10:21:23Z) - Pretraining on Interactions for Learning Grounded Affordance Representations [22.290431852705662]
We train a neural network to predict objects' trajectories in a simulated interaction.
We show that our network's latent representations differentiate between both observed and unobserved affordances.
Our results suggest a way in which modern deep learning approaches to grounded language learning can be integrated with traditional formal semantic notions of lexical representations.
arXiv Detail & Related papers (2022-07-05T19:19:53Z) - Reasoning-Modulated Representations [85.08205744191078]
We study a common setting where our task is not purely opaque.
Our approach paves the way for a new class of data-efficient representation learning methods.
arXiv Detail & Related papers (2021-07-19T13:57:13Z) - Implicit Representations of Meaning in Neural Language Models [31.71898809435222]
We identify contextual word representations that function as models of entities and situations as they evolve throughout a discourse.
Our results indicate that prediction in pretrained neural language models is supported, at least in part, by dynamic representations of meaning and implicit simulation of entity state.
arXiv Detail & Related papers (2021-06-01T19:23:20Z) - Explainable Deep Classification Models for Domain Generalization [94.43131722655617]
Explanations are defined as regions of visual evidence upon which a deep classification network makes a decision.
Our training strategy enforces a periodic saliency-based feedback to encourage the model to focus on the image regions that directly correspond to the ground-truth object.
arXiv Detail & Related papers (2020-03-13T22:22:15Z) - Semantic Holism and Word Representations in Artificial Neural Networks [0.0]
We show that word representations from the Skip-gram variant of the word2vec model exhibit interesting semantic properties.
This is usually explained by referring to the general distributional hypothesis.
We propose a more specific approach based on Frege's holistic and functional approach to meaning.
arXiv Detail & Related papers (2020-03-11T21:04:49Z)
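For the Myhill-Nerode-inspired evaluation mentioned in the "Evaluating the World Model Implicit in a Generative Model" entry above, here is a toy sketch (my illustration, not that paper's code) of the underlying equivalence idea: prefixes that admit exactly the same continuations correspond to the same state, so any faithful world model should treat them interchangeably. The "world" below is an invented two-state language.

```python
# Toy Myhill-Nerode-style equivalence check: group prefixes by the set of
# short continuations that lead to acceptance. The number of distinct groups
# lower-bounds how many states a faithful model of this "world" needs.
from itertools import product

ALPHABET = "ab"

def accepts(word):
    """Toy 'world': the language of strings with an even number of 'a's."""
    return word.count("a") % 2 == 0

def residual(prefix, max_len=3):
    """All continuations up to length `max_len` that are accepted after `prefix`."""
    return frozenset(
        "".join(suffix)
        for n in range(max_len + 1)
        for suffix in product(ALPHABET, repeat=n)
        if accepts(prefix + "".join(suffix))
    )

# Prefixes with identical residuals are Myhill-Nerode equivalent.
prefixes = ["", "a", "b", "ab", "aa", "ba"]
classes = {}
for p in prefixes:
    classes.setdefault(residual(p), []).append(p)

for i, members in enumerate(classes.values()):
    print(f"equivalence class {i}: {members}")
```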
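The linear-subspace manipulation described in the "A Geometric Notion of Causal Probing" entry can be illustrated very loosely (this is not that paper's method) by estimating a concept direction as a difference of class means and shifting a representation along it. The binary concept, dimensions, and data below are invented for the example.

```python
# Loose illustration of manipulating a concept value along a linear direction
# in representation space. All representations here are synthetic placeholders.
import numpy as np

rng = np.random.default_rng(1)
dim = 32

# Synthetic representations for a binary concept (e.g. singular vs. plural),
# separated along a hidden ground-truth direction.
true_dir = rng.normal(size=dim)
true_dir /= np.linalg.norm(true_dir)
reps_singular = rng.normal(size=(500, dim)) - 1.5 * true_dir
reps_plural = rng.normal(size=(500, dim)) + 1.5 * true_dir

# Estimate the concept direction as the normalized difference of class means.
direction = reps_plural.mean(axis=0) - reps_singular.mean(axis=0)
direction /= np.linalg.norm(direction)

def set_concept(rep, direction, target_value):
    """Remove the component along `direction`, then set it to `target_value`."""
    return rep - (rep @ direction) * direction + target_value * direction

# Push a "singular" representation toward the plural side of the subspace.
rep = reps_singular[0]
edited = set_concept(rep, direction, target_value=1.5)
print("projection before:", rep @ direction)
print("projection after: ", edited @ direction)
```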
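Finally, the probabilistic-filtering view in the "Grounded Decoding" entry, where a token should score well under both the language model and a set of grounded objectives, can be caricatured with a greedy decoder over toy scoring functions. Both scorers below are stand-ins, not the paper's models or algorithm.

```python
# Toy greedy decoder that combines a language-model score with a grounded
# score at each step. The two scoring functions are invented placeholders.
VOCAB = ["pick", "up", "the", "red", "block", "bowl", "<eos>"]
CANNED = ["pick", "up", "the", "red", "OBJECT"]

def lm_logprob(prefix, token):
    """Placeholder LM: follows a canned instruction but is ambivalent
    about which object ('block' or 'bowl') to name."""
    step = len(prefix)
    if step < len(CANNED):
        if CANNED[step] == "OBJECT":
            return 0.0 if token in ("block", "bowl") else -3.0
        return 0.0 if token == CANNED[step] else -3.0
    return 0.0 if token == "<eos>" else -3.0

def grounded_logprob(prefix, token):
    """Placeholder grounded score: the scene contains a block but no bowl."""
    return -5.0 if token == "bowl" else 0.0

def greedy_grounded_decode(max_len=8):
    prefix = []
    for _ in range(max_len):
        # Score every candidate under both models and keep the best sum.
        token = max(VOCAB, key=lambda t: lm_logprob(prefix, t) + grounded_logprob(prefix, t))
        if token == "<eos>":
            break
        prefix.append(token)
    return prefix

print(" ".join(greedy_grounded_decode()))  # -> "pick up the red block"
```

The grounded penalty breaks the language model's tie between "block" and "bowl", which is the basic intuition behind combining the two objectives at decode time.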
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.