Online Bayesian Goal Inference for Boundedly-Rational Planning Agents
- URL: http://arxiv.org/abs/2006.07532v2
- Date: Sun, 25 Oct 2020 01:36:16 GMT
- Title: Online Bayesian Goal Inference for Boundedly-Rational Planning Agents
- Authors: Tan Zhi-Xuan, Jordyn L. Mann, Tom Silver, Joshua B. Tenenbaum, Vikash K. Mansinghka
- Abstract summary: We present an architecture capable of inferring an agent's goals online from both optimal and non-optimal sequences of actions.
Our architecture models agents as boundedly-rational planners that interleave search with execution by replanning.
We develop Sequential Inverse Plan Search (SIPS), a sequential Monte Carlo algorithm that exploits the online replanning assumption of these models.
- Score: 46.60073262357339
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: People routinely infer the goals of others by observing their actions over
time. Remarkably, we can do so even when those actions lead to failure,
enabling us to assist others when we detect that they might not achieve their
goals. How might we endow machines with similar capabilities? Here we present
an architecture capable of inferring an agent's goals online from both optimal
and non-optimal sequences of actions. Our architecture models agents as
boundedly-rational planners that interleave search with execution by
replanning, thereby accounting for sub-optimal behavior. These models are
specified as probabilistic programs, allowing us to represent and perform
efficient Bayesian inference over an agent's goals and internal planning
processes. To perform such inference, we develop Sequential Inverse Plan Search
(SIPS), a sequential Monte Carlo algorithm that exploits the online replanning
assumption of these models, limiting computation by incrementally extending
inferred plans as new actions are observed. We present experiments showing that
this modeling and inference architecture outperforms Bayesian inverse
reinforcement learning baselines, accurately inferring goals from both optimal
and non-optimal trajectories involving failure and back-tracking, while
generalizing across domains with compositional structure and sparse rewards.
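To make the inference loop in the abstract concrete, below is a minimal Python sketch of online Bayesian goal inference with a particle filter, in the spirit of SIPS. Everything in it is an illustrative assumption rather than the authors' implementation: the 1-D world, the candidate goal set, and the Boltzmann-rational action model stand in for the paper's planning domains and probabilistic programs (which the authors write in Gen), and the per-step action likelihood abstracts away SIPS's incremental extension of partial plans.

```python
import math
import random

# Illustrative assumptions (not the paper's domains or code): a 1-D line
# world, three candidate goals, and a Boltzmann-rational action model.
GOALS = [0, 5, 9]          # candidate goal positions
ACTIONS = [-1, 0, 1]       # move left, stay, move right
BETA = 2.0                 # inverse temperature: higher = more rational

def action_logprob(state, action, goal):
    """Log-likelihood of an action for a boundedly-rational agent that
    prefers actions reducing its distance to the goal."""
    def utility(a):
        return -abs((state + a) - goal)
    logits = [BETA * utility(a) for a in ACTIONS]
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return BETA * utility(action) - log_z

def infer_goal_posterior(trajectory, n_particles=300):
    """Online particle filter over goal hypotheses: reweight on each
    observed (state, action) pair, resampling when weights degenerate."""
    particles = [random.choice(GOALS) for _ in range(n_particles)]
    weights = [1.0 / n_particles] * n_particles
    for state, action in trajectory:
        weights = [w * math.exp(action_logprob(state, action, g))
                   for w, g in zip(weights, particles)]
        total = sum(weights)
        weights = [w / total for w in weights]
        ess = 1.0 / sum(w * w for w in weights)   # effective sample size
        if ess < n_particles / 2:                 # resample if degenerate
            particles = random.choices(particles, weights, k=n_particles)
            weights = [1.0 / n_particles] * n_particles
    posterior = {g: 0.0 for g in GOALS}
    for g, w in zip(particles, weights):
        posterior[g] += w
    return posterior

# An agent starting at 2 and repeatedly moving right is most plausibly
# heading for goal 5 or 9; posterior mass shifts online with each action.
print(infer_goal_posterior([(2, 1), (3, 1), (4, 1)]))
```

Because the posterior is updated one observation at a time, inference tracks the agent online, which is the same qualitative behavior the abstract describes for both optimal and non-optimal trajectories.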
Related papers
- Closed-Loop Long-Horizon Robotic Planning via Equilibrium Sequence Modeling [23.62433580021779]
We advocate a self-refining scheme that iteratively refines a draft plan until an equilibrium is reached.
A nested equilibrium sequence modeling procedure is devised for efficient closed-loop planning.
Our method is evaluated on the VirtualHome-Env benchmark, showing strong performance and better scaling at inference time (a toy sketch of the refine-to-equilibrium idea follows this entry).
arXiv Detail & Related papers (2024-10-02T11:42:49Z)
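Below is a toy Python sketch of the refine-until-equilibrium idea from the entry above. The local-smoothing refinement operator and the 1-D waypoint plan are invented stand-ins for the paper's learned equilibrium sequence model; only the fixed-point control flow is the point here.

```python
# Invented stand-ins (not the paper's model): a 1-D waypoint plan and a
# local-smoothing refinement operator; only the fixed-point loop matters.

def refine(plan, goal):
    """One refinement pass: move each interior waypoint to the midpoint of
    its neighbors, keeping the endpoints pinned (a stand-in for a learned
    sequence-model refiner)."""
    new = plan[:]
    for i in range(1, len(plan) - 1):
        new[i] = round((plan[i - 1] + plan[i + 1]) / 2, 6)
    new[-1] = goal
    return new

def plan_to_equilibrium(draft, goal, max_iters=1000):
    """Iteratively refine the draft plan until it is a fixed point of the
    refinement operator, i.e. an equilibrium."""
    plan = draft
    for _ in range(max_iters):
        refined = refine(plan, goal)
        if refined == plan:   # equilibrium: refining changes nothing
            return plan
        plan = refined
    return plan

# A jagged draft relaxes to the smooth (here: linear) equilibrium plan.
print(plan_to_equilibrium([0.0, 3.0, -2.0, 7.0, 10.0], goal=10.0))
```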
- Adaptive Planning with Generative Models under Uncertainty [20.922248169620783]
Planning with generative models has emerged as an effective decision-making paradigm across a wide range of domains.
While continuous replanning at each timestep might seem intuitive because it allows decisions to be made based on the most recent environmental observations, it results in substantial computational challenges.
Our work addresses this challenge by introducing a simple adaptive planning policy that leverages the generative model's ability to predict long-horizon state trajectories (a sketch of one replan-on-deviation interpretation follows this entry).
arXiv Detail & Related papers (2024-08-02T18:07:53Z)
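The following is a hedged sketch of one natural reading of the entry above: follow the model's long-horizon trajectory prediction and replan only when the observed state drifts from it. The linear toy dynamics, the stand-in predictor, and the fixed deviation threshold are all assumptions made for illustration, not the paper's policy.

```python
import numpy as np

# Illustrative assumptions (not the paper's method): linear toy dynamics,
# a stand-in "generative model" that predicts a decaying trajectory, and a
# fixed deviation threshold for triggering replanning.

def predict_trajectory(state, horizon=10):
    """Stand-in long-horizon prediction: the state decays toward zero."""
    traj = [state]
    for _ in range(horizon):
        traj.append(traj[-1] * 0.9)
    return traj

def run(initial_state, steps=50, threshold=0.5, seed=0):
    rng = np.random.default_rng(seed)
    state = np.asarray(initial_state, dtype=float)
    predicted = predict_trajectory(state)
    t, replans = 1, 1
    for _ in range(steps):
        # Environment step with noise the model does not capture.
        state = state * 0.9 + rng.normal(0.0, 0.1, size=state.shape)
        # Replan only when reality drifts too far from the prediction,
        # or when the predicted trajectory runs out.
        if t >= len(predicted) or np.linalg.norm(state - predicted[t]) > threshold:
            predicted, t, replans = predict_trajectory(state), 1, replans + 1
        else:
            t += 1
    return replans

# Far fewer replans than the 50 a replan-every-step policy would use.
print("replans used:", run([5.0, -3.0]))
```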
- On efficient computation in active inference [1.1470070927586016]
We present a novel planning algorithm for finite temporal horizons with drastically lower computational complexity.
We also simplify the process of setting an appropriate target distribution for new and existing active inference planning schemes.
arXiv Detail & Related papers (2023-07-02T07:38:56Z)
- Planning with Sequence Models through Iterative Energy Minimization [22.594413287842574]
We suggest an approach towards integrating planning with sequence models based on the idea of iterative energy minimization.
We train a masked language model to capture an implicit energy function over trajectories of actions, and formulate planning as finding a trajectory of actions with minimum energy.
We illustrate how this procedure enables improved performance over recent approaches across BabyAI and Atari environments (a toy sketch of the energy-minimization loop follows this entry).
arXiv Detail & Related papers (2023-03-28T17:53:22Z)
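Here is a toy sketch of planning as iterative energy minimization, as in the entry above. The hand-written energy function stands in for the trained masked language model, and the remask-and-refill loop is a simplified substitute for the authors' exact procedure.

```python
import random

# Hand-written stand-ins (not the trained masked language model): a 1-D
# world where low energy means reaching the goal with little effort, and
# a remask-and-refill loop as the iterative minimizer.

ACTIONS = [-1, 0, 1]
START, GOAL, HORIZON = 0, 4, 8

def energy(actions):
    """Lower energy = trajectory ending at the goal using fewer moves."""
    position = START + sum(actions)
    effort = 0.1 * sum(abs(a) for a in actions)
    return abs(position - GOAL) + effort

def plan_by_energy_minimization(iters=200, seed=0):
    random.seed(seed)
    actions = [random.choice(ACTIONS) for _ in range(HORIZON)]
    for _ in range(iters):
        i = random.randrange(HORIZON)   # "mask" one position in the plan
        # Refill it with the action that minimizes trajectory energy.
        # Note: greedy coordinate refill can settle in a local minimum,
        # which stochastic (MCMC-style) refilling would help escape.
        actions[i] = min(ACTIONS, key=lambda a: energy(
            actions[:i] + [a] + actions[i + 1:]))
    return actions

actions = plan_by_energy_minimization()
print(actions, "end:", START + sum(actions), "energy:", energy(actions))
```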
- When Demonstrations Meet Generative World Models: A Maximum Likelihood Framework for Offline Inverse Reinforcement Learning [62.00672284480755]
This paper aims to recover the structure of rewards and environment dynamics that underlie observed actions in a fixed, finite set of demonstrations from an expert agent.
Accurate models of expertise in executing a task have applications in safety-sensitive domains such as clinical decision making and autonomous driving.
arXiv Detail & Related papers (2023-02-15T04:14:20Z)
- Discrete Factorial Representations as an Abstraction for Goal Conditioned Reinforcement Learning [99.38163119531745]
We show that applying a discretizing bottleneck can improve performance in goal-conditioned RL setups.
We experimentally demonstrate improved expected return on out-of-distribution goals, while still allowing goals with expressive structure to be specified (a minimal vector-quantization sketch follows this entry).
arXiv Detail & Related papers (2022-11-01T03:31:43Z)
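Below is a minimal sketch of a discretizing bottleneck on goal representations, in the spirit of the entry above: continuous goal embeddings are snapped to the nearest entries of per-factor codebooks, so downstream components see only a discrete factorial code. The codebook shapes and nearest-neighbor quantization are generic vector-quantization assumptions, not the paper's exact architecture (which would also need training, e.g. with a straight-through estimator).

```python
import numpy as np

# Generic vector-quantization assumptions (not the paper's architecture):
# the goal embedding is split into factors, and each factor is snapped to
# the nearest entry of its own codebook, yielding a discrete factorial code.

rng = np.random.default_rng(0)
N_FACTORS, CODES_PER_FACTOR, DIM = 4, 8, 16
codebooks = rng.normal(size=(N_FACTORS, CODES_PER_FACTOR, DIM))

def discretize_goal(goal_embedding):
    """Quantize each factor of the embedding to its nearest codebook entry;
    a policy can then condition on the discrete codes alone."""
    factors = goal_embedding.reshape(N_FACTORS, DIM)
    codes, quantized = [], []
    for factor, book in zip(factors, codebooks):
        idx = int(np.argmin(np.linalg.norm(book - factor, axis=1)))
        codes.append(idx)
        quantized.append(book[idx])
    return codes, np.concatenate(quantized)

goal = rng.normal(size=N_FACTORS * DIM)    # a continuous goal embedding
codes, quantized = discretize_goal(goal)
print("discrete goal code:", codes)        # e.g. [3, 1, 5, 0]
```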
- Forethought and Hindsight in Credit Assignment [62.05690959741223]
We work to understand the gains and peculiarities of planning employed as forethought via forward models or as hindsight operating with backward models.
We investigate the best use of models in planning, primarily focusing on the selection of states in which predictions should be (re)-evaluated.
arXiv Detail & Related papers (2020-10-26T16:00:47Z)
- Long-Horizon Visual Planning with Goal-Conditioned Hierarchical Predictors [124.30562402952319]
The ability to predict and plan into the future is fundamental for agents acting in the world.
Current learning approaches for visual prediction and planning fail on long-horizon tasks.
We propose a framework for visual prediction and planning that overcomes these limitations.
arXiv Detail & Related papers (2020-06-23T17:58:56Z)
- PlanGAN: Model-based Planning With Sparse Rewards and Multiple Goals [14.315501760755609]
PlanGAN is a model-based algorithm for solving multi-goal tasks in environments with sparse rewards.
Our studies indicate that PlanGAN can achieve comparable performance whilst being around 4-8 times more sample efficient.
arXiv Detail & Related papers (2020-06-01T12:53:09Z)
- Divide-and-Conquer Monte Carlo Tree Search For Goal-Directed Planning [78.65083326918351]
We consider alternatives to an implicit sequential planning assumption.
We propose Divide-and-Conquer Monte Carlo Tree Search (DC-MCTS) for approximating the optimal plan.
We show that this algorithmic flexibility over planning order leads to improved results in grid-world navigation tasks (a toy divide-and-conquer sketch follows this entry).
arXiv Detail & Related papers (2020-04-23T18:08:58Z)
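To illustrate the planning-order flexibility mentioned above, here is a toy divide-and-conquer planner: rather than rolling a plan out sequentially, it proposes an intermediate subgoal and solves the two halves recursively. The grid domain, heuristic midpoint ordering, and exhaustive split search are simplifications; DC-MCTS instead learns where to split and searches with Monte Carlo Tree Search.

```python
# Simplifications (DC-MCTS learns where to split and searches with MCTS):
# a small grid, midpoints tried in order of distance to the geometric
# center, and exhaustive recursion with a depth cap.

def neighbors(cell):
    x, y = cell
    return [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]

def dc_plan(start, goal, free_cells, depth=4):
    """Return a list of adjacent waypoints from start to goal, or None."""
    if start == goal:
        return [start]
    if goal in neighbors(start):
        return [start, goal]
    if depth == 0:
        return None
    mid_x, mid_y = (start[0] + goal[0]) / 2, (start[1] + goal[1]) / 2
    # Propose subgoals near the middle first; solve both halves recursively.
    for mid in sorted(free_cells,
                      key=lambda c: abs(c[0] - mid_x) + abs(c[1] - mid_y)):
        if mid in (start, goal):
            continue
        left = dc_plan(start, mid, free_cells, depth - 1)
        right = dc_plan(mid, goal, free_cells, depth - 1)
        if left and right:
            return left[:-1] + right
    return None

# A 4x4 grid with one blocked cell; the plan routes around it.
free = {(x, y) for x in range(4) for y in range(4)} - {(1, 1)}
print(dc_plan((0, 0), (3, 3), free))
```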
This list is automatically generated from the titles and abstracts of the papers on this site.