Code Models are Zero-shot Precondition Reasoners
- URL: http://arxiv.org/abs/2311.09601v1
- Date: Thu, 16 Nov 2023 06:19:27 GMT
- Title: Code Models are Zero-shot Precondition Reasoners
- Authors: Lajanugen Logeswaran, Sungryull Sohn, Yiwei Lyu, Anthony Zhe Liu,
Dong-Ki Kim, Dongsub Shim, Moontae Lee, Honglak Lee
- Abstract summary: We use code representations to reason about action preconditions for sequential decision making tasks.
We propose a precondition-aware action sampling strategy that ensures actions predicted by a policy are consistent with preconditions.
- Score: 83.8561159080672
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: One of the fundamental skills required for an agent acting in an environment
to complete tasks is the ability to understand what actions are plausible at
any given point. This work explores a novel use of code representations to
reason about action preconditions for sequential decision making tasks. Code
representations offer the flexibility to model procedural activities and
associated constraints as well as the ability to execute and verify constraint
satisfaction. Leveraging code representations, we extract action preconditions
from demonstration trajectories in a zero-shot manner using pre-trained code
models. Given these extracted preconditions, we propose a precondition-aware
action sampling strategy that ensures actions predicted by a policy are
consistent with preconditions. We demonstrate that the proposed approach
enhances the performance of few-shot policy learning approaches across
task-oriented dialog and embodied textworld benchmarks.
Related papers
- Imagination Policy: Using Generative Point Cloud Models for Learning Manipulation Policies [25.760946763103483]
We propose Imagination Policy, a novel multi-task key-frame policy network for solving high-precision pick and place tasks.
Instead of learning actions directly, Imagination Policy generates point clouds to imagine desired states which are then translated to actions using rigid action estimation.
arXiv Detail & Related papers (2024-06-17T17:00:41Z) - Learning to Generate All Feasible Actions [4.333208181196761]
We introduce action mapping, a novel approach that divides the learning process into two steps: first learn feasibility and subsequently, the objective.
This paper focuses on the feasibility part by learning to generate all feasible actions through self-supervised querying of the feasibility model.
We demonstrate the agent's proficiency in generating actions across disconnected feasible action sets.
arXiv Detail & Related papers (2023-01-26T23:15:51Z) - Regularized Soft Actor-Critic for Behavior Transfer Learning [10.519534498340482]
Existing imitation learning methods mainly focus on making an agent effectively mimic a demonstrated behavior.
We propose a method called Regularized Soft Actor-Critic which formulates the main task and the imitation task.
We evaluate our method on continuous control tasks relevant to video games applications.
arXiv Detail & Related papers (2022-09-27T07:52:04Z) - Reinforcement Learning for Task Specifications with Action-Constraints [4.046919218061427]
We propose a method to learn optimal control policies for a finite-state Markov Decision Process.
We assume that the set of action sequences that are deemed unsafe and/or safe are given in terms of a finite-state automaton.
We present a version of the Q-learning algorithm for learning optimal policies in the presence of non-Markovian action-sequence and state constraints.
arXiv Detail & Related papers (2022-01-02T04:22:01Z) - Conditional Meta-Learning of Linear Representations [57.90025697492041]
Standard meta-learning for representation learning aims to find a common representation to be shared across multiple tasks.
In this work we overcome this issue by inferring a conditioning function, mapping the tasks' side information into a representation tailored to the task at hand.
We propose a meta-algorithm capable of leveraging this advantage in practice.
arXiv Detail & Related papers (2021-03-30T12:02:14Z) - Instance-Aware Predictive Navigation in Multi-Agent Environments [93.15055834395304]
We propose an Instance-Aware Predictive Control (IPC) approach, which forecasts interactions between agents as well as future scene structures.
We adopt a novel multi-instance event prediction module to estimate the possible interaction among agents in the ego-centric view.
We design a sequential action sampling strategy to better leverage predicted states on both scene-level and instance-level.
arXiv Detail & Related papers (2021-01-14T22:21:25Z) - Goal-Aware Prediction: Learning to Model What Matters [105.43098326577434]
One of the fundamental challenges in using a learned forward dynamics model is the mismatch between the objective of the learned model and that of the downstream planner or policy.
We propose to direct prediction towards task relevant information, enabling the model to be aware of the current task and encouraging it to only model relevant quantities of the state space.
We find that our method more effectively models the relevant parts of the scene conditioned on the goal, and as a result outperforms standard task-agnostic dynamics models and model-free reinforcement learning.
arXiv Detail & Related papers (2020-07-14T16:42:59Z) - Inferring Temporal Compositions of Actions Using Probabilistic Automata [61.09176771931052]
We propose to express temporal compositions of actions as semantic regular expressions and derive an inference framework using probabilistic automata.
Our approach is different from existing works that either predict long-range complex activities as unordered sets of atomic actions, or retrieve videos using natural language sentences.
arXiv Detail & Related papers (2020-04-28T00:15:26Z) - Goal-Conditioned End-to-End Visuomotor Control for Versatile Skill
Primitives [89.34229413345541]
We propose a conditioning scheme which avoids pitfalls by learning the controller and its conditioning in an end-to-end manner.
Our model predicts complex action sequences based directly on a dynamic image representation of the robot motion.
We report significant improvements in task success over representative MPC and IL baselines.
arXiv Detail & Related papers (2020-03-19T15:04:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.