A Composable Specification Language for Reinforcement Learning Tasks
- URL: http://arxiv.org/abs/2008.09293v2
- Date: Thu, 29 Oct 2020 15:02:43 GMT
- Title: A Composable Specification Language for Reinforcement Learning Tasks
- Authors: Kishor Jothimurugan, Rajeev Alur and Osbert Bastani
- Abstract summary: We propose a language for specifying complex control tasks, along with an algorithm that compiles specifications in our language into a reward function and automatically performs reward shaping.
We implement our approach in a tool called SPECTRL, and show that it outperforms several state-of-the-art baselines.
- Score: 23.08652058034537
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning is a promising approach for learning control policies
for robot tasks. However, specifying complex tasks (e.g., with multiple
objectives and safety constraints) can be challenging, since the user must
design a reward function that encodes the entire task. Furthermore, the user
often needs to manually shape the reward to ensure convergence of the learning
algorithm. We propose a language for specifying complex control tasks, along
with an algorithm that compiles specifications in our language into a reward
function and automatically performs reward shaping. We implement our approach
in a tool called SPECTRL, and show that it outperforms several state-of-the-art
baselines.
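To make the compile-to-reward idea concrete, here is a minimal sketch of a composable specification language whose terms compile to a shaped, stage-tracking reward function. The combinators (near, achieve, seq), the quantitative predicate semantics, and the reward values are illustrative assumptions in the spirit of the paper, not SPECTRL's actual API.

```python
# Minimal sketch: a composable spec language compiled to a shaped reward.
# Combinator names and semantics are illustrative assumptions, not SPECTRL's API.
from dataclasses import dataclass
from typing import Callable, List

State = List[float]                     # e.g., a 2-D position
Predicate = Callable[[State], float]    # quantitative semantics: > 0 iff satisfied

def near(goal: State, tol: float = 0.5) -> Predicate:
    """Quantitative 'reach goal' predicate: positive inside a tol-ball."""
    return lambda s: tol - max(abs(a - b) for a, b in zip(s, goal))

@dataclass
class Spec:
    stages: List[Predicate]             # satisfy each predicate in order

def achieve(p: Predicate) -> Spec:
    return Spec([p])

def seq(a: Spec, b: Spec) -> Spec:
    """Sequential composition: finish a, then b."""
    return Spec(a.stages + b.stages)

def compile_reward(spec: Spec) -> Callable[[State], float]:
    """Stateful shaped reward: a bonus for finishing the current stage, and a
    continuous progress signal toward it otherwise."""
    progress = {"stage": 0}
    def reward(s: State) -> float:
        i = progress["stage"]
        if i >= len(spec.stages):
            return 0.0                  # all stages already complete
        q = spec.stages[i](s)
        if q > 0:                       # stage satisfied: advance the monitor
            progress["stage"] += 1
            return 1.0
        return q                        # shaped: closer => larger (less negative)
    return reward

# "Reach (5, 5), then return to (0, 0)."
task = seq(achieve(near([5.0, 5.0])), achieve(near([0.0, 0.0])))
r = compile_reward(task)
print(r([4.8, 5.1]))   # 1.0: first stage completed
print(r([2.0, 2.0]))   # -1.5: shaped progress toward the second stage
```

The key property the sketch preserves is compositionality: sequencing two specifications yields a reward that pays off stage by stage, which is what makes automatic shaping possible.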
Related papers
- CurricuLLM: Automatic Task Curricula Design for Learning Complex Robot Skills using Large Language Models [19.73329768987112]
CurricuLLM is a curriculum learning tool for complex robot control tasks.
It generates subtasks, described in natural language, that aid learning of the target task, and translates those natural language descriptions into executable code.
In this way, CurricuLLM aids the learning of complex robot control tasks.
arXiv Detail & Related papers (2024-09-27T01:48:16Z)
- Counting Reward Automata: Sample Efficient Reinforcement Learning Through the Exploitation of Reward Function Structure [13.231546105751015]
We present counting reward automata, a finite-state machine variant capable of modelling any reward function expressible as a formal language.
We prove that an agent equipped with such an abstract machine can solve a larger set of tasks than agents using current approaches.
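As a rough illustration of why counters help, the sketch below models "collect k items, then exit" with a single counter instead of k duplicated automaton states. The class, events, and reward values are assumptions for illustration, not the paper's exact formalism.

```python
# A minimal sketch of a counting reward automaton: a finite-state machine
# extended with a counter register, so conditions like "after collecting k
# items" need no duplicated states. Illustrative, not the paper's formalism.
class CountingRewardAutomaton:
    def __init__(self, k: int):
        self.k = k              # items required before the exit pays off
        self.count = 0          # counter register
        self.state = "collect"

    def step(self, event: str) -> float:
        """Consume a labelled environment event; return the emitted reward."""
        if self.state == "collect" and event == "item":
            self.count += 1
            if self.count >= self.k:     # counter guard on the transition
                self.state = "go_to_exit"
            return 0.1                   # small shaping reward per item
        if self.state == "go_to_exit" and event == "exit":
            self.state = "done"
            return 1.0                   # task completed
        return 0.0

cra = CountingRewardAutomaton(k=3)
for e in ["item", "item", "exit", "item", "exit"]:
    print(e, cra.step(e))
# The first 'exit' pays nothing: the counter guard has not yet been met.
```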
arXiv Detail & Related papers (2023-12-18T17:20:38Z)
- Interactive Task Planning with Language Models [97.86399877812923]
Recent large language model based approaches allow more open-ended planning but often require heavy prompt engineering or domain-specific pretrained models.
We propose a simple framework that achieves interactive task planning with language models: an interactive robot framework that accomplishes long-horizon task planning and generalizes easily to new goals or distinct tasks, even during execution.
arXiv Detail & Related papers (2023-10-16T17:59:12Z)
- LARG, Language-based Automatic Reward and Goal Generation [8.404316955848602]
We develop an approach that converts a text-based task description into its corresponding reward and goal-generation functions.
We evaluate our approach for robotic manipulation and demonstrate its ability to train and execute policies in a scalable manner.
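To make the text-to-reward idea concrete, here is a hypothetical example of the kind of reward and goal-generation functions such a pipeline might emit for the instruction "move the cube to the target". All function names, observation fields, and thresholds are assumptions for illustration, not LARG's actual output.

```python
import numpy as np

# Hypothetical output of a text-to-reward pipeline; names and thresholds
# are illustrative assumptions, not code generated by LARG itself.
def generated_reward(obs: dict) -> float:
    cube = np.asarray(obs["cube_pos"])
    target = np.asarray(obs["target_pos"])
    dist = float(np.linalg.norm(cube - target))
    reached = dist < 0.05
    # Dense distance term plus a sparse completion bonus.
    return -dist + (10.0 if reached else 0.0)

def generated_goal() -> dict:
    # Matching goal sampler: a random target on the table surface.
    return {"target_pos": np.random.uniform([-0.3, -0.3, 0.0], [0.3, 0.3, 0.0])}

print(generated_reward({"cube_pos": [0.1, 0.0, 0.0],
                        "target_pos": [0.12, 0.0, 0.0]}))  # 9.98: goal reached
```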
arXiv Detail & Related papers (2023-06-19T14:52:39Z)
- Automaton-Guided Curriculum Generation for Reinforcement Learning Agents [14.20447398253189]
Automaton-guided Curriculum Learning (AGCL) is a novel method for automatically generating curricula for the target task in the form of Directed Acyclic Graphs (DAGs).
AGCL encodes the specification in the form of a deterministic finite automaton (DFA), and then uses the DFA along with the Object-Oriented MDP representation to generate a curriculum as a DAG.
Experiments in gridworld and physics-based simulated robotics domains show that the curricula produced by AGCL achieve improved time-to-threshold performance.
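A minimal sketch of the DFA-to-curriculum step, under simplifying assumptions: each DFA transition is treated as a subtask, and a DAG edge orders a subtask before any subtask that departs from its resulting automaton state. This is illustrative, not AGCL's exact construction.

```python
from collections import defaultdict

# DFA for "get the key, then open the door": states 0 -> 1 -> 2 (accepting).
dfa = {(0, "get_key"): 1, (1, "open_door"): 2}

def curriculum_dag(dfa):
    """Subtasks are DFA transitions; an edge q --sym--> q2 must be learned
    before any subtask that departs from q2."""
    dag = defaultdict(list)
    for (q, sym), q2 in dfa.items():
        for (p, sym2), _ in dfa.items():
            if p == q2:
                dag[(q, sym)].append((p, sym2))
    return dict(dag)

print(curriculum_dag(dfa))
# {(0, 'get_key'): [(1, 'open_door')]} -- learn 'get_key' before 'open_door'.
```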
arXiv Detail & Related papers (2023-04-11T15:14:31Z)
- Divide & Conquer Imitation Learning [75.31752559017978]
Imitation Learning can be a powerful approach to bootstrap the learning process.
We present a novel algorithm designed to imitate complex robotic tasks from the states of an expert trajectory.
We show that our method imitates a non-holonomic navigation task and scales to a complex simulated robotic manipulation task with very high sample efficiency.
arXiv Detail & Related papers (2022-04-15T09:56:50Z)
- Multi-Task Learning with Sequence-Conditioned Transporter Networks [67.57293592529517]
We aim to solve multi-task learning through the lens of sequence-conditioning and weighted sampling.
We propose MultiRavens, a new benchmark suite aimed at compositional tasks, which allows defining custom task combinations.
We also propose a vision-based end-to-end system architecture, Sequence-Conditioned Transporter Networks, which augments Goal-Conditioned Transporter Networks with sequence-conditioning and weighted sampling.
arXiv Detail & Related papers (2021-09-15T21:19:11Z)
- Reinforcement Learning Agent Training with Goals for Real World Tasks [3.747737951407512]
Reinforcement Learning (RL) is a promising approach for solving various control, optimization, and sequential decision making tasks.
We propose a specification language (Inkling Goal Specification) for complex control and optimization tasks.
We include a set of experiments showing that the proposed method makes it easy to specify a wide range of real-world tasks.
arXiv Detail & Related papers (2021-07-21T23:21:16Z)
- Outcome-Driven Reinforcement Learning via Variational Inference [95.82770132618862]
We discuss a new perspective on reinforcement learning, recasting it as the problem of inferring actions that achieve desired outcomes, rather than a problem of maximizing rewards.
To solve the resulting outcome-directed inference problem, we establish a novel variational inference formulation that allows us to derive a well-shaped reward function.
We empirically demonstrate that this method eliminates the need to design reward functions and leads to effective goal-directed behaviors.
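The flavor of the derived reward can be sketched as the log-likelihood that a transition produces the desired outcome. The Gaussian outcome model below is an illustrative assumption, not the paper's exact variational objective.

```python
import numpy as np

# Sketch of the outcome-driven view: score a transition by the log-likelihood
# of the desired outcome g under a simple Gaussian outcome model around the
# next state. The model and sigma are illustrative assumptions.
def outcome_log_likelihood_reward(next_state, goal, sigma=0.5):
    next_state, goal = np.asarray(next_state), np.asarray(goal)
    d = next_state - goal
    k = len(goal)
    # log N(goal; next_state, sigma^2 I): smooth and well-shaped in distance.
    return float(-0.5 * d @ d / sigma**2
                 - 0.5 * k * np.log(2 * np.pi * sigma**2))

print(outcome_log_likelihood_reward([0.4, 0.1], [0.5, 0.0]))
```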
arXiv Detail & Related papers (2021-04-20T18:16:21Z)
- Towards Coordinated Robot Motions: End-to-End Learning of Motion Policies on Transform Trees [63.31965375413414]
We propose to solve multi-task problems through learning structured policies from human demonstrations.
Our structured policy is inspired by RMPflow, a framework for combining subtask policies on different spaces.
We derive an end-to-end learning objective function that is suitable for the multi-task problem.
arXiv Detail & Related papers (2020-12-24T22:46:22Z)
- Automatic Curriculum Learning through Value Disagreement [95.19299356298876]
Continually solving new, unsolved tasks is the key to learning diverse behaviors.
In the multi-task domain, where an agent needs to reach multiple goals, the choice of training goals can largely affect sample efficiency.
We propose setting up an automatic curriculum for goals that the agent needs to solve.
We evaluate our method across 13 multi-goal robotic tasks and 5 navigation tasks, and demonstrate performance gains over current state-of-the-art methods.
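A toy sketch of the selection rule: score candidate goals by the disagreement (standard deviation) of a value-function ensemble, then sample training goals in proportion to that score, favoring goals at the frontier of the agent's competence. The linear "value functions" here are stand-in assumptions for learned networks.

```python
import numpy as np

# Curriculum by value disagreement: prefer goals where an ensemble of value
# functions disagrees most. Linear value functions are toy stand-ins.
rng = np.random.default_rng(0)
ensemble = [rng.normal(size=2) for _ in range(5)]   # 5 toy value functions

def disagreement(goal):
    preds = [w @ goal for w in ensemble]
    return float(np.std(preds))                     # epistemic disagreement

candidates = [rng.uniform(-1, 1, size=2) for _ in range(20)]
scores = np.array([disagreement(g) for g in candidates])
probs = scores / scores.sum()                       # sample ∝ disagreement
goal = candidates[rng.choice(len(candidates), p=probs)]
print(goal, disagreement(goal))
```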
arXiv Detail & Related papers (2020-06-17T03:58:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.