Reinforcement Learning Agent Training with Goals for Real World Tasks
- URL: http://arxiv.org/abs/2107.10390v1
- Date: Wed, 21 Jul 2021 23:21:16 GMT
- Title: Reinforcement Learning Agent Training with Goals for Real World Tasks
- Authors: Xuan Zhao and Marcos Campos
- Abstract summary: Reinforcement Learning (RL) is a promising approach for solving various control, optimization, and sequential decision making tasks.
We propose a specification language (Inkling Goal Specification) for complex control and optimization tasks.
We include a set of experiments showing that the proposed method makes it easy to specify a wide range of real-world tasks.
- Score: 3.747737951407512
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Reinforcement Learning (RL) is a promising approach for solving various
control, optimization, and sequential decision making tasks. However, designing
reward functions for complex tasks (e.g., with multiple objectives and safety
constraints) can be challenging for most users and usually requires multiple
expensive trials (reward function hacking). In this paper we propose a
specification language (Inkling Goal Specification) for complex control and
optimization tasks, which is very close to natural language and allows a
practitioner to focus on problem specification instead of reward function
hacking. The core elements of our framework are: (i) mapping the high level
language to a predicate temporal logic tailored to control and optimization
tasks, (ii) a novel automaton-guided dense reward generation that can be used
to drive RL algorithms, and (iii) a set of performance metrics to assess the
behavior of the system. We include a set of experiments showing that the
proposed method makes it easy to specify a wide range of real-world tasks, and
that the generated reward can drive policy training to achieve the specified
goal.
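The core mechanism the abstract describes, compiling a goal specification into an automaton whose transitions generate a dense reward, can be sketched minimally in Python. The `GoalAutomaton` class, the predicates, and the reward constants below are illustrative assumptions, not the actual Inkling Goal Specification implementation:

```python
# Minimal sketch of automaton-guided dense reward generation.
# Hypothetical illustration only -- not the paper's Inkling implementation.

from typing import Callable, List

Predicate = Callable[[float], bool]  # maps a (toy, scalar) state to True/False

class GoalAutomaton:
    """Tracks progress through an ordered sequence of goal predicates."""

    def __init__(self, stages: List[Predicate]):
        self.stages = stages
        self.current = 0  # index of the stage we are currently trying to satisfy

    def step_reward(self, state: float) -> float:
        """Dense reward: +1 for advancing a stage, a small penalty otherwise."""
        if self.current < len(self.stages) and self.stages[self.current](state):
            self.current += 1
            return 1.0   # progress bonus on each automaton transition
        return -0.01     # per-step cost keeps the policy moving

    @property
    def done(self) -> bool:
        return self.current == len(self.stages)

# Toy sequential goal: first reach x >= 5, then return to x <= 0.
automaton = GoalAutomaton([lambda x: x >= 5.0, lambda x: x <= 0.0])
trajectory = [0.0, 2.0, 5.5, 3.0, -0.5]
total = sum(automaton.step_reward(x) for x in trajectory)
print(round(total, 2), automaton.done)  # 1.97 True
```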
Related papers
- Stage-Wise Reward Shaping for Acrobatic Robots: A Constrained Multi-Objective Reinforcement Learning Approach [12.132416927711036]
We introduce an RL method aimed at simplifying the reward-shaping process through intuitive strategies.
We define multiple reward and cost functions within a constrained multi-objective RL (CMORL) framework.
For tasks involving sequential complex movements, we segment the task into distinct stages and define multiple rewards and costs for each stage.
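As a rough illustration of the stage-wise idea, the sketch below defines separate rewards and costs per stage of a two-stage motion; the `Stage` container and the stage definitions are hypothetical, not the paper's CMORL formulation:

```python
# Sketch of stage-wise reward/cost definitions for a two-stage task.
# All stage definitions here are hypothetical placeholders.

from dataclasses import dataclass
from typing import Callable, Dict, List

State = Dict[str, float]

@dataclass
class Stage:
    name: str
    done: Callable[[State], bool]            # stage-completion test
    rewards: List[Callable[[State], float]]  # objectives to maximize
    costs: List[Callable[[State], float]]    # constraints to keep small

stages = [
    Stage("crouch",
          done=lambda s: s["height"] < 0.3,
          rewards=[lambda s: -s["height"]],      # get low
          costs=[lambda s: abs(s["torque"])]),   # limit joint torque
    Stage("jump",
          done=lambda s: s["height"] > 1.0,
          rewards=[lambda s: s["height"]],       # get high
          costs=[lambda s: abs(s["torque"])]),
]

def evaluate(stage_idx: int, s: State):
    """Return (reward, cost, next stage index) for the active stage."""
    stage = stages[stage_idx]
    r = sum(f(s) for f in stage.rewards)
    c = sum(g(s) for g in stage.costs)
    nxt = stage_idx + 1 if stage.done(s) and stage_idx + 1 < len(stages) else stage_idx
    return r, c, nxt

print(evaluate(0, {"height": 0.2, "torque": 0.5}))  # (-0.2, 0.5, 1)
```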
arXiv Detail & Related papers (2024-09-24T05:25:24Z)
- Proximal Curriculum with Task Correlations for Deep Reinforcement Learning [25.10619062353793]
We consider curriculum design in contextual multi-task settings where the agent's final performance is measured w.r.t. a target distribution over complex tasks.
We propose a novel curriculum, ProCuRL-Target, that effectively balances the need for selecting tasks that are not too difficult for the agent while progressing the agent's learning toward the target distribution via leveraging task correlations.
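The balance the summary describes, tasks that are neither too hard nor irrelevant to the target, can be illustrated with a toy scoring rule. The form of `curriculum_score` below (a difficulty term times a correlation term) is an assumption for illustration, not ProCuRL-Target's actual objective:

```python
# Illustration of a proximal-curriculum scoring rule (hypothetical form):
# prefer tasks of intermediate difficulty, weighted by how correlated
# they are with the target task distribution.

def curriculum_score(success_prob: float, target_correlation: float) -> float:
    proximal = success_prob * (1.0 - success_prob)  # peaks at p = 0.5
    return proximal * target_correlation

tasks = {
    "easy_near": (0.9, 0.8),   # (agent's success probability, correlation)
    "hard_near": (0.1, 0.8),
    "mid_near":  (0.5, 0.8),
    "mid_far":   (0.5, 0.1),
}
best = max(tasks, key=lambda t: curriculum_score(*tasks[t]))
print(best)  # mid_near
```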
arXiv Detail & Related papers (2024-05-03T21:07:54Z)
- RL-GPT: Integrating Reinforcement Learning and Code-as-policy [82.1804241891039]
We introduce a two-level hierarchical framework, RL-GPT, comprising a slow agent and a fast agent.
The slow agent analyzes actions suitable for coding, while the fast agent executes coding tasks.
This decomposition effectively focuses each agent on specific tasks, proving highly efficient within our pipeline.
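A purely structural sketch of the two-level loop follows, with rule-based stand-ins for both agents; RL-GPT itself uses LLMs, so nothing below reflects its actual prompts or interfaces:

```python
# Structural sketch of a slow/fast agent decomposition.
# Both agents are hypothetical rule-based stand-ins for LLM agents.

def slow_agent(subtasks):
    """Decide which subtasks to hand-code and which to learn with RL."""
    return {t: ("code" if meta["deterministic"] else "rl")
            for t, meta in subtasks.items()}

def fast_agent(task, mode):
    """Execute a subtask in the chosen mode (stubbed out here)."""
    return f"{task}: {'wrote policy code' if mode == 'code' else 'trained RL policy'}"

subtasks = {
    "craft_table": {"deterministic": True},   # fixed recipe -> code it
    "find_wood":   {"deterministic": False},  # exploration -> learn it
}
for task, mode in slow_agent(subtasks).items():
    print(fast_agent(task, mode))
```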
arXiv Detail & Related papers (2024-02-29T16:07:22Z)
- Sample Efficient Reinforcement Learning by Automatically Learning to Compose Subtasks [3.1594865504808944]
We propose an RL algorithm that automatically structures the reward function for sample efficiency, given a set of labels that signify subtasks.
We evaluate our algorithm in a variety of sparse-reward environments.
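One plausible reading of this idea, sketched below, is to grant a one-time bonus whenever a new subtask label is observed in an otherwise sparse-reward environment; the labeling and bonus scheme are assumptions, not the paper's algorithm:

```python
# Hedged sketch: turning subtask labels into a shaped reward.
# The labels and the 0.5 bonus are illustrative assumptions.

def shaped_reward(env_reward: float, labels: set, achieved: set):
    """Add a one-time bonus for each newly observed subtask label."""
    new = labels - achieved
    return env_reward + 0.5 * len(new), achieved | new

achieved = set()
steps = [(0.0, {"got_key"}), (0.0, set()), (1.0, {"opened_door"})]
for env_r, labels in steps:
    r, achieved = shaped_reward(env_r, labels, achieved)
    print(r)  # 0.5, 0.0, 1.5
```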
arXiv Detail & Related papers (2024-01-25T15:06:40Z)
- Reinforcement Learning with Success Induced Task Prioritization [68.8204255655161]
We introduce Success Induced Task Prioritization (SITP), a framework for automatic curriculum learning.
The algorithm selects the order of tasks that provides the fastest learning for agents.
We demonstrate that SITP matches or surpasses the results of other curriculum design methods.
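A minimal sketch of success-induced prioritization: sample the task whose success rate is changing fastest. The two-window progress measure below is a simplified assumption, not SITP's exact scoring rule:

```python
# Sketch of curriculum selection by learning progress on success rates.
# The two-window progress estimate is an illustrative simplification.

def learning_progress(history):
    """Absolute change in success rate between two recent windows."""
    half = len(history) // 2
    old = sum(history[:half]) / max(half, 1)
    new = sum(history[half:]) / max(len(history) - half, 1)
    return abs(new - old)

success_history = {
    "task_a": [0, 0, 0, 0, 1, 1],  # improving fast -> prioritize
    "task_b": [1, 1, 1, 1, 1, 1],  # already solved
    "task_c": [0, 0, 0, 0, 0, 0],  # no signal yet
}
next_task = max(success_history,
                key=lambda t: learning_progress(success_history[t]))
print(next_task)  # task_a
```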
arXiv Detail & Related papers (2022-12-30T12:32:43Z)
- Discrete Factorial Representations as an Abstraction for Goal Conditioned Reinforcement Learning [99.38163119531745]
We show that applying a discretizing bottleneck can improve performance in goal-conditioned RL setups.
We experimentally demonstrate improved expected return on out-of-distribution goals, while still allowing for specifying goals with expressive structure.
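The discretizing bottleneck can be sketched as a vector-quantization step that snaps a continuous goal embedding to the nearest entry of a codebook; the random codebook below is a stand-in for the learned (and factorial) representation in the paper:

```python
# Minimal sketch of a discretizing (vector-quantization) bottleneck for
# goal embeddings. Codebook and goals are random stand-ins.

import numpy as np

rng = np.random.default_rng(0)
codebook = rng.normal(size=(8, 4))  # 8 discrete codes, 4-dim embeddings

def quantize(goal_embedding: np.ndarray) -> np.ndarray:
    """Snap a continuous goal embedding to its nearest codebook entry."""
    dists = np.linalg.norm(codebook - goal_embedding, axis=1)
    return codebook[np.argmin(dists)]

goal = rng.normal(size=4)
discrete_goal = quantize(goal)  # the policy conditions on this code
print(discrete_goal)
```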
arXiv Detail & Related papers (2022-11-01T03:31:43Z)
- Planning to Practice: Efficient Online Fine-Tuning by Composing Goals in Latent Space [76.46113138484947]
General-purpose robots require diverse repertoires of behaviors to complete challenging tasks in real-world unstructured environments.
To address this issue, goal-conditioned reinforcement learning aims to acquire policies that can reach goals for a wide range of tasks on command.
We propose Planning to Practice, a method that makes it practical to train goal-conditioned policies for long-horizon tasks.
arXiv Detail & Related papers (2022-05-17T06:58:17Z)
- A Composable Specification Language for Reinforcement Learning Tasks [23.08652058034537]
We propose a language for specifying complex control tasks, along with an algorithm that compiles specifications in our language into a reward function and automatically performs reward shaping.
We implement our approach in a tool called SPECTRL, and show that it outperforms several state-of-the-art baselines.
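In the spirit of such combinators, the sketch below composes `achieve` and `seq` specifications into a quantitative score over trajectories (a score of 0 or above means the spec is satisfied); these semantics are a simplified assumption, not SPECTRL's actual compilation:

```python
# Sketch of composable task specifications compiled to a quantitative
# score. The distance-based semantics are an illustrative assumption.

def achieve(dist_fn):
    """Spec satisfied when dist_fn(state) <= 0; score is -distance."""
    return lambda traj: max(-dist_fn(s) for s in traj)

def seq(spec1, spec2):
    """Satisfy spec1 on a prefix, then spec2 on the remaining suffix."""
    return lambda traj: max(
        min(spec1(traj[:i]), spec2(traj[i:]))
        for i in range(1, len(traj))
    )

# Toy 1-D task: reach x = 5, then return to x = 0.
reach_5 = achieve(lambda s: abs(s - 5))
reach_0 = achieve(lambda s: abs(s - 0))
spec = seq(reach_5, reach_0)
score = spec([0.0, 3.0, 5.0, 2.0, 0.0])
print(score >= 0)  # True -> the trajectory satisfies the spec
```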
arXiv Detail & Related papers (2020-08-21T03:40:57Z)
- Automatic Curriculum Learning through Value Disagreement [95.19299356298876]
Continually solving new, unsolved tasks is the key to learning diverse behaviors.
In the multi-task domain, where an agent needs to reach multiple goals, the choice of training goals can largely affect sample efficiency.
We propose setting up an automatic curriculum for goals that the agent needs to solve.
We evaluate our method across 13 multi-goal robotic tasks and 5 navigation tasks, and demonstrate performance gains over current state-of-the-art methods.
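The goal-selection step can be sketched as picking candidate goals where an ensemble of value functions disagrees most; the linear critics below are random stand-ins for the learned ensemble:

```python
# Sketch of goal selection by value disagreement: propose the goal on
# which an ensemble of critics disagrees most. Critics are random
# stand-ins for learned value functions.

import numpy as np

rng = np.random.default_rng(1)
K, dim = 5, 3
critics = [rng.normal(size=dim) for _ in range(K)]  # hypothetical linear critics
candidate_goals = rng.normal(size=(10, dim))

def disagreement(goal: np.ndarray) -> float:
    """Std of the ensemble's value predictions for one goal."""
    return float(np.std([goal @ w for w in critics]))

scores = [disagreement(g) for g in candidate_goals]
next_goal = candidate_goals[int(np.argmax(scores))]  # train on this goal next
print(np.round(scores, 2))
```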
arXiv Detail & Related papers (2020-06-17T03:58:25Z)
- Certified Reinforcement Learning with Logic Guidance [78.2286146954051]
We propose a model-free RL algorithm that enables the use of Linear Temporal Logic (LTL) to formulate a goal for unknown continuous-state/action Markov Decision Processes (MDPs).
The algorithm is guaranteed to synthesise a control policy whose traces satisfy the specification with maximal probability.
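The core construction, running the MDP in product with an automaton derived from the LTL formula and rewarding accepting transitions, can be sketched with a hand-built automaton for F(a & F b) ("eventually a, then eventually b"); the paper constructs a Limit-Deterministic Buchi Automaton automatically, so this three-state machine is only a hand-coded stand-in:

```python
# Sketch of a product construction for the LTL goal F(a & F b).
# The automaton below is hand-coded, not automatically synthesized.

# Automaton states: 0 = waiting for a, 1 = waiting for b, 2 = accepting.
def automaton_step(q: int, labels: set) -> int:
    if q == 0 and "a" in labels:
        return 1
    if q == 1 and "b" in labels:
        return 2
    return q

def product_reward(q_next: int) -> float:
    return 1.0 if q_next == 2 else 0.0  # reward only upon acceptance

q = 0
trace = [set(), {"a"}, set(), {"b"}]    # labels observed along an MDP run
for labels in trace:
    q = automaton_step(q, labels)
print(q, product_reward(q))  # 2 1.0
```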
arXiv Detail & Related papers (2019-02-02T20:09:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.