Evaluating the World Model Implicit in a Generative Model
- URL: http://arxiv.org/abs/2406.03689v3
- Date: Sun, 10 Nov 2024 23:47:33 GMT
- Title: Evaluating the World Model Implicit in a Generative Model
- Authors: Keyon Vafa, Justin Y. Chen, Ashesh Rambachan, Jon Kleinberg, Sendhil Mullainathan
- Abstract summary: Recent work suggests that large language models may implicitly learn world models.
This includes problems as diverse as simple logical reasoning, geographic navigation, game-playing, and chemistry.
We propose new evaluation metrics for world model recovery inspired by the classic Myhill-Nerode theorem from language theory.
- Score: 7.317896355747284
- License:
- Abstract: Recent work suggests that large language models may implicitly learn world models. How should we assess this possibility? We formalize this question for the case where the underlying reality is governed by a deterministic finite automaton. This includes problems as diverse as simple logical reasoning, geographic navigation, game-playing, and chemistry. We propose new evaluation metrics for world model recovery inspired by the classic Myhill-Nerode theorem from language theory. We illustrate their utility in three domains: game playing, logic puzzles, and navigation. In all domains, the generative models we consider do well on existing diagnostics for assessing world models, but our evaluation metrics reveal their world models to be far less coherent than they appear. Such incoherence creates fragility: using a generative model to solve related but subtly different tasks can lead to failures. Building generative models that meaningfully capture the underlying logic of the domains they model would be immensely valuable; our results suggest new ways to assess how close a given model is to that goal.
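The Myhill-Nerode idea behind the proposed metrics can be illustrated with a toy sketch: two prefixes lead to the same DFA state if and only if no suffix distinguishes their acceptance. The parity automaton below is a hypothetical example for illustration, not one of the paper's domains.

```python
from itertools import product

# A toy DFA: accepts strings over {a, b} with an even number of 'a's.
# (Illustrative stand-in; the paper's domains include games, puzzles, navigation.)
TRANSITIONS = {("even", "a"): "odd", ("even", "b"): "even",
               ("odd", "a"): "even", ("odd", "b"): "odd"}
START, ACCEPT = "even", {"even"}

def accepts(word):
    state = START
    for ch in word:
        state = TRANSITIONS[(state, ch)]
    return state in ACCEPT

def distinguished(u, v, max_len=3, alphabet="ab"):
    """Return a suffix s with accepts(u+s) != accepts(v+s), or None.
    By the Myhill-Nerode theorem, two prefixes reach the same DFA state
    iff no such distinguishing suffix exists."""
    for n in range(max_len + 1):
        for s in map("".join, product(alphabet, repeat=n)):
            if accepts(u + s) != accepts(v + s):
                return s
    return None

print(distinguished("a", "aaa"))  # None: both prefixes reach the 'odd' state
print(distinguished("a", "aa"))   # "": the prefixes already differ on acceptance
```

The paper's metrics ask whether a generative model respects these equivalences: sequences that should be interchangeable must admit the same continuations, and inequivalent sequences must be separable by some probe suffix.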
Related papers
- Meta-Models: An Architecture for Decoding LLM Behaviors Through Interpreted Embeddings and Natural Language [0.0]
We use a "meta-model" that takes activations from an "input-model" and answers natural language questions about the input-model's behaviors.
We evaluate the meta-model's ability to generalize by training it on selected task types and assessing its out-of-distribution performance in deceptive scenarios.
arXiv Detail & Related papers (2024-10-03T13:25:15Z) - Promises and Pitfalls of Generative Masked Language Modeling: Theoretical Framework and Practical Guidelines [74.42485647685272]
We focus on Generative Masked Language Models (GMLMs)
We train a model to fit conditional probabilities of the data distribution via masking, which are subsequently used as inputs to a Markov Chain to draw samples from the model.
We adapt the T5 model for iteratively-refined parallel decoding, achieving 2-3x speedup in machine translation with minimal sacrifice in quality.
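The sampling scheme described above can be sketched as Gibbs-style resampling, where a learned conditional distribution serves as the Markov chain's transition kernel. The conditional below is a hypothetical toy stand-in for a masked language model, not the paper's T5-based setup.

```python
import random

# Toy stand-in for a masked model's conditional p(x_i | x_{-i}):
# a distribution over {0, 1} that favors agreeing with neighboring positions.
def conditional(seq, i):
    """Hypothetical conditional: P(x_i = 1 | rest of sequence)."""
    neighbors = [seq[j] for j in (i - 1, i + 1) if 0 <= j < len(seq)]
    return (1 + 2 * sum(neighbors)) / (2 + 2 * len(neighbors))  # smoothed

def gibbs_sample(length=8, steps=200, seed=0):
    """Draw a sample by repeatedly masking one position and resampling it
    from the conditional, i.e. running the induced Markov chain."""
    rng = random.Random(seed)
    seq = [rng.randint(0, 1) for _ in range(length)]
    for _ in range(steps):
        i = rng.randrange(length)     # position to mask
        p1 = conditional(seq, i)      # model's conditional at that position
        seq[i] = 1 if rng.random() < p1 else 0
    return seq

print(gibbs_sample())
```

In the paper's setting, the conditional comes from a trained masked model rather than a hand-written rule, but the chain structure is the same: each step resamples a masked position given the rest.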
arXiv Detail & Related papers (2024-07-22T18:00:00Z) - Elements of World Knowledge (EWOK): A cognition-inspired framework for evaluating basic world knowledge in language models [42.48862540545121]
We present Elements of World Knowledge (EWOK), a framework for evaluating world modeling in language models.
EWOK targets specific concepts from multiple knowledge domains known to be vital for world modeling in humans.
We then introduce EWOK-CORE-1.0, a dataset of 4,374 items covering 11 world knowledge domains.
arXiv Detail & Related papers (2024-05-15T17:19:42Z) - Is Sora a World Simulator? A Comprehensive Survey on General World Models and Beyond [101.15395503285804]
General world models represent a crucial pathway toward achieving Artificial General Intelligence (AGI)
In this survey, we embark on a comprehensive exploration of the latest advancements in world models.
We examine challenges and limitations of world models, and discuss their potential future directions.
arXiv Detail & Related papers (2024-05-06T14:37:07Z) - Automated Statistical Model Discovery with Language Models [34.03743547761152]
We introduce a method for language model driven automated statistical model discovery.
We cast our automated procedure within the principled framework of Box's Loop.
Our results highlight the promise of LM-driven model discovery.
arXiv Detail & Related papers (2024-02-27T20:33:22Z) - Grounded Decoding: Guiding Text Generation with Grounded Models for Embodied Agents [111.15288256221764]
The Grounded Decoding project aims to solve complex, long-horizon tasks in a robotic setting by leveraging the knowledge of both language models and grounded models.
We frame this as a problem similar to probabilistic filtering: decode a sequence that both has high probability under the language model and high probability under a set of grounded model objectives.
We demonstrate how such grounded models can be obtained across three simulation and real-world domains, and that the proposed decoding strategy is able to solve complex, long-horizon tasks in a robotic setting by leveraging the knowledge of both models.
arXiv Detail & Related papers (2023-03-01T22:58:50Z) - Evaluation of Categorical Generative Models -- Bridging the Gap Between Real and Synthetic Data [18.142397311464343]
We introduce an appropriately scalable evaluation method for generative models.
We consider increasingly large probability spaces, which correspond to increasingly difficult modeling tasks.
We validate our evaluation procedure with synthetic experiments on both synthetic generative models and current state-of-the-art categorical generative models.
arXiv Detail & Related papers (2022-10-28T21:05:25Z) - Synthetic Model Combination: An Instance-wise Approach to Unsupervised Ensemble Learning [92.89846887298852]
Consider making a prediction over new test data without any opportunity to learn from a training set of labelled data. Instead, we are given access to a set of expert models and their predictions, alongside some limited information about the datasets used to train them.
arXiv Detail & Related papers (2022-10-11T10:20:31Z) - Abstract Interpretation for Generalized Heuristic Search in Model-Based Planning [50.96320003643406]
Domain-general model-based planners often derive their generality by constructing search heuristics through the relaxation of symbolic world models.
We illustrate how abstract interpretation can serve as a unifying framework for these abstractions, extending the reach of search to richer world models.
These abstractions can also be integrated with learning, allowing agents to jumpstart planning in novel world models via abstraction-derived information.
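A minimal sketch of the abstract-interpretation idea: instead of tracking exact numeric world state during search, track sound bounds that over-approximate every reachable concrete state. The interval domain and the "move" action below are hypothetical illustrations, not the paper's planners.

```python
# Minimal sketch of abstract interpretation with intervals: a relaxed world
# model replaces exact values with sound lower/upper bounds.
class Interval:
    """An abstract value [lo, hi] over-approximating a set of concrete values."""
    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi

    def __add__(self, other):
        # Abstract addition: bounds that hold for any pair of concrete values.
        return Interval(self.lo + other.lo, self.hi + other.hi)

    def __repr__(self):
        return f"[{self.lo}, {self.hi}]"

# Relaxed "move" action: fuel cost somewhere between 1 and 3 units (made up).
fuel = Interval(10, 10)
cost = Interval(-3, -1)
after_two_moves = fuel + cost + cost
print(after_two_moves)  # [4, 8]: every reachable concrete fuel level lies inside
```

Because the abstract result contains every concrete outcome, a heuristic computed on the relaxed model can safely prune search in the full model.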
arXiv Detail & Related papers (2022-08-05T00:22:11Z) - Modeling the Mistakes of Boundedly Rational Agents Within a Bayesian Theory of Mind [32.66203057545608]
We extend the Bayesian Theory of Mind framework to model boundedly rational agents who may have mistaken goals, plans, and actions.
We present experiments eliciting human goal inferences in two domains: (i) a gridworld puzzle with gems locked behind doors, and (ii) a block-stacking domain.
arXiv Detail & Related papers (2021-06-24T18:00:03Z) - Plausible Counterfactuals: Auditing Deep Learning Classifiers with Realistic Adversarial Examples [84.8370546614042]
The black-box nature of deep learning models has posed unanswered questions about what they learn from data.
A Generative Adversarial Network (GAN) and multi-objective optimization are used to furnish a plausible attack on the audited model.
Its utility is showcased within a human face classification task, unveiling the enormous potential of the proposed framework.
arXiv Detail & Related papers (2020-03-25T11:08:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.