Related papers: ExoPredicator: Learning Abstract Models of Dynamic Worlds for Robot Planning

URL: http://arxiv.org/abs/2509.26255v2
Date: Wed, 01 Oct 2025 01:58:01 GMT
Title: ExoPredicator: Learning Abstract Models of Dynamic Worlds for Robot Planning
Authors: Yichao Liang, Dat Nguyen, Cambridge Yang, Tianyang Li, Joshua B. Tenenbaum, Carl Edward Rasmussen, Adrian Weller, Zenna Tavares, Tom Silver, Kevin Ellis
Abstract summary: We propose a framework for abstract world models that jointly learns symbolic state representations and causal processes for both endogenous actions and exogenous mechanisms. Across five simulated tabletop robotics environments, the learned models enable fast planning that generalizes to held-out tasks with more objects and more complex goals, outperforming a range of baselines.
Score: 77.49815848173613
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Long-horizon embodied planning is challenging because the world does not only change through an agent's actions: exogenous processes (e.g., water heating, dominoes cascading) unfold concurrently with the agent's actions. We propose a framework for abstract world models that jointly learns (i) symbolic state representations and (ii) causal processes for both endogenous actions and exogenous mechanisms. Each causal process models the time course of a stochastic cause-effect relation. We learn these world models from limited data via variational Bayesian inference combined with LLM proposals. Across five simulated tabletop robotics environments, the learned models enable fast planning that generalizes to held-out tasks with more objects and more complex goals, outperforming a range of baselines.

Related papers

DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos [110.98100817695307]
We introduce DreamDojo, a foundation world model that learns diverse interactions and dexterous controls from 44k hours of egocentric human videos. Our work enables several important applications based on generative world models, including live teleoperation, policy evaluation, and model-based planning.
arXiv Detail & Related papers (2026-02-06T18:49:43Z)
Social World Model-Augmented Mechanism Design Policy Learning [58.739456918502704]
We introduce SWM-AP (Social World Model-Augmented Mechanism Design Policy Learning), which learns a social world model hierarchically to enhance mechanism design. We show that SWM-AP outperforms established model-based and model-free RL baselines in cumulative rewards and sample efficiency.
arXiv Detail & Related papers (2025-10-22T06:01:21Z)
SimuRA: Towards General Goal-Oriented Agent via Simulative Reasoning Architecture with LLM-Based World Model [88.04128601981145]
We introduce SimuRA, a goal-oriented architecture for generalized agentic reasoning. SimuRA overcomes the limitations of autoregressive reasoning by introducing a world model for planning via simulation. World-model-based planning, in particular, shows a consistent advantage of up to 124% over autoregressive planning.
arXiv Detail & Related papers (2025-07-31T17:57:20Z)
Revisiting Multi-Agent World Modeling from a Diffusion-Inspired Perspective [54.77404771454794]
We develop a flexible and robust world model for Multi-Agent Reinforcement Learning (MARL) using diffusion models. Our method, Diffusion-Inspired Multi-Agent world model (DIMA), achieves state-of-the-art performance across multiple multi-agent control benchmarks.
arXiv Detail & Related papers (2025-05-27T09:11:38Z)
Curiosity-Driven Imagination: Discovering Plan Operators and Learning Associated Policies for Open-World Adaptation [7.406934849952094]
Adapting quickly to dynamic, uncertain environments is a major challenge in robotics. Traditional Task and Motion Planning approaches struggle to cope with unforeseen changes, are data-inefficient when adapting, and do not leverage world models during learning. We address this issue with a hybrid planning and learning system that integrates two models: a low-level neural-network-based model that learns transitions and drives exploration via an Intrinsic Curiosity Module (ICM). Our evaluation in a robotic manipulation domain with sequential novelty injections demonstrates that our approach converges faster and outperforms state-of-the-art hybrid methods.
arXiv Detail & Related papers (2025-03-06T20:02:26Z)
Simplifying Latent Dynamics with Softly State-Invariant World Models [10.722955763425228]
We introduce the Parsimonious Latent Space Model (PLSM), a world model that regularizes the latent dynamics to make the effect of the agent's actions more predictable. We find that our regularization improves accuracy, generalization, and performance in downstream tasks.
arXiv Detail & Related papers (2024-01-31T13:52:11Z)
EgoPlan-Bench: Benchmarking Multimodal Large Language Models for Human-Level Planning [84.6451394629312]
We introduce EgoPlan-Bench, a benchmark to evaluate the planning abilities of MLLMs in real-world scenarios. We show that EgoPlan-Bench poses significant challenges, highlighting a substantial scope for improvement in MLLMs to achieve human-level task planning. We also present EgoPlan-IT, a specialized instruction-tuning dataset that effectively enhances model performance on EgoPlan-Bench.
arXiv Detail & Related papers (2023-12-11T03:35:58Z)
CoPAL: Corrective Planning of Robot Actions with Large Language Models [7.944803163555092]
We propose a system architecture that orchestrates a seamless interplay between cognitive levels, encompassing reasoning, planning, and motion generation. At its core lies a novel replanning strategy that handles physically grounded, logical, and semantic errors in the generated plans.
arXiv Detail & Related papers (2023-10-11T07:39:42Z)
Transferring Foundation Models for Generalizable Robotic Manipulation [82.12754319808197]
We propose a novel paradigm that effectively leverages language-reasoning segmentation masks generated by internet-scale foundation models. Our approach can effectively and robustly perceive object pose and enable sample-efficient generalization learning. Demos can be found in our submitted video, and more comprehensive ones can be found in link1 or link2.
arXiv Detail & Related papers (2023-06-09T07:22:12Z)
Relax, it doesn't matter how you get there: A new self-supervised approach for multi-timescale behavior analysis [8.543808476554695]
We develop a multi-task representation learning model for behavior that combines two novel components. Our model ranks 1st overall and on all global tasks, and 1st or 2nd on 7 out of 9 frame-level tasks.
arXiv Detail & Related papers (2023-03-15T17:58:48Z)
Curious Exploration via Structured World Models Yields Zero-Shot Object Manipulation [19.840186443344]
We propose to use structured world models to incorporate inductive biases in the control loop to achieve sample-efficient exploration. Our method generates free-play behavior that starts to interact with objects early on and develops more complex behavior over time.
arXiv Detail & Related papers (2022-06-22T22:08:50Z)
Goal-Aware Prediction: Learning to Model What Matters [105.43098326577434]
One of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model and that of the downstream planner or policy. We propose to direct prediction towards task relevant information, enabling the model to be aware of the current task and encouraging it to only model relevant quantities of the state space. We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
arXiv Detail & Related papers (2020-07-14T16:42:59Z)

This list is automatically generated from the titles and abstracts of the papers on this site.