Leveraging Sequentiality in Reinforcement Learning from a Single
Demonstration
- URL: http://arxiv.org/abs/2211.04786v2
- Date: Mon, 17 Apr 2023 09:18:28 GMT
- Title: Leveraging Sequentiality in Reinforcement Learning from a Single
Demonstration
- Authors: Alexandre Chenu, Olivier Serris, Olivier Sigaud and Nicolas
Perrin-Gilbert
- Abstract summary: We propose to leverage a sequential bias to learn control policies for complex robotic tasks using a single demonstration.
We show that DCIL-II can solve with unprecedented sample efficiency some challenging simulated tasks such as humanoid locomotion and stand-up.
- Score: 68.94506047556412
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Reinforcement Learning has been successfully applied to learn robotic
control. However, the corresponding algorithms struggle when applied to
problems where the agent is only rewarded after achieving a complex task. In
this context, using demonstrations can significantly speed up the learning
process, but demonstrations can be costly to acquire. In this paper, we propose
to leverage a sequential bias to learn control policies for complex robotic
tasks using a single demonstration. To do so, our method learns a
goal-conditioned policy to control a system between successive low-dimensional
goals. This sequential goal-reaching approach raises a problem of compatibility
between successive goals: we need to ensure that the state resulting from
reaching a goal is compatible with the achievement of the following goals. To
tackle this problem, we present a new algorithm called DCIL-II. We show that
DCIL-II can solve with unprecedented sample efficiency some challenging
simulated tasks such as humanoid locomotion and stand-up as well as fast
running with a simulated Cassie robot. Our method, which leverages
sequentiality, is a step towards solving complex robotic tasks under minimal
specification effort, a key feature for the next generation of autonomous
robots.
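The sequential goal-reaching idea described in the abstract can be illustrated with a small sketch. This is not the DCIL-II algorithm itself, only a toy 1D illustration of the general scheme: subsample a single demonstration into a sequence of low-dimensional goals, then chain a goal-conditioned controller through them, so that each goal is attempted from the state produced by reaching the previous one (the compatibility constraint between successive goals). All names here (`extract_goals`, `goal_policy`, `run_sequential`) are hypothetical.

```python
# Toy sketch of sequential goal-reaching from one demonstration.
# Not DCIL-II: a 1D point agent with a hand-written controller stands in
# for the learned goal-conditioned policy.

def extract_goals(demo, stride=3):
    """Subsample the demonstration into a sequence of intermediate goals."""
    goals = demo[stride - 1::stride]
    if not goals or goals[-1] != demo[-1]:  # always keep the final goal
        goals.append(demo[-1])
    return goals

def goal_policy(state, goal, step=0.5):
    """Trivial goal-conditioned controller: bounded move toward the goal."""
    delta = goal - state
    return max(-step, min(step, delta))

def run_sequential(demo, tol=1e-6, max_steps=200):
    state = demo[0]
    trace = [state]
    for goal in extract_goals(demo):
        # Each goal is pursued from the state reached for the previous one,
        # which is why successive goals must be compatible.
        for _ in range(max_steps):
            if abs(state - goal) <= tol:
                break
            state += goal_policy(state, goal)
            trace.append(state)
    return state, trace

final, trace = run_sequential([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
```

In DCIL-II the controller is a learned goal-conditioned policy and the goals are low-dimensional projections of demonstration states, but the chaining structure is the same.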
Related papers
- Single-Reset Divide & Conquer Imitation Learning [49.87201678501027]
Demonstrations are commonly used to speed up the learning process of Deep Reinforcement Learning algorithms.
Some algorithms have been developed to learn from a single demonstration.
arXiv Detail & Related papers (2024-02-14T17:59:47Z)
- Stabilizing Contrastive RL: Techniques for Robotic Goal Reaching from Offline Data [101.43350024175157]
Self-supervised learning has the potential to decrease the amount of human annotation and engineering effort required to learn control strategies.
Our work builds on prior work showing that reinforcement learning (RL) itself can be cast as a self-supervised problem.
We demonstrate that a self-supervised RL algorithm based on contrastive learning can solve real-world, image-based robotic manipulation tasks.
arXiv Detail & Related papers (2023-06-06T01:36:56Z)
- Silver-Bullet-3D at ManiSkill 2021: Learning-from-Demonstrations and Heuristic Rule-based Methods for Object Manipulation [118.27432851053335]
This paper presents an overview and comparative analysis of our systems designed for two tracks in the SAPIEN ManiSkill Challenge 2021, including the No Interaction track.
The No Interaction track targets learning policies from pre-collected demonstration trajectories.
In this track, we design a Heuristic Rule-based Method (HRM) to trigger high-quality object manipulation by decomposing the task into a series of sub-tasks.
For each sub-task, simple rule-based control strategies are adopted to predict actions applied to the robotic arms.
arXiv Detail & Related papers (2022-06-13T16:20:42Z)
- Bi-Manual Manipulation and Attachment via Sim-to-Real Reinforcement Learning [23.164743388342803]
We study how to solve bi-manual tasks using reinforcement learning trained in simulation.
We also discuss modifications to our simulated environment which lead to effective training of RL policies.
In this work, we design a Connect Task, where the aim is for two robot arms to pick up and attach two blocks with magnetic connection points.
arXiv Detail & Related papers (2022-03-15T21:49:20Z)
- Automatic Goal Generation using Dynamical Distance Learning [5.797847756967884]
Reinforcement Learning (RL) agents can learn to solve complex sequential decision making tasks by interacting with the environment.
In the field of multi-goal RL, where agents are required to reach multiple goals to solve complex tasks, improving sample efficiency can be especially challenging.
We propose a method for automatic goal generation using a dynamical distance function (DDF) in a self-supervised fashion.
arXiv Detail & Related papers (2021-11-07T16:23:56Z)
- Automatic Curriculum Learning through Value Disagreement [95.19299356298876]
Continually solving new, unsolved tasks is the key to learning diverse behaviors.
In the multi-task domain, where an agent needs to reach multiple goals, the choice of training goals can largely affect sample efficiency.
We propose setting up an automatic curriculum for goals that the agent needs to solve.
We evaluate our method across 13 multi-goal robotic tasks and 5 navigation tasks, and demonstrate performance gains over current state-of-the-art methods.
arXiv Detail & Related papers (2020-06-17T03:58:25Z)
- SQUIRL: Robust and Efficient Learning from Video Demonstration of Long-Horizon Robotic Manipulation Tasks [8.756012472587601]
Deep reinforcement learning (RL) can be used to learn complex manipulation tasks.
RL requires the robot to collect a large amount of real-world experience.
SQUIRL performs a new but related long-horizon task robustly, given only a single video demonstration.
arXiv Detail & Related papers (2020-03-10T20:26:26Z)
- Scalable Multi-Task Imitation Learning with Autonomous Improvement [159.9406205002599]
We build an imitation learning system that can continuously improve through autonomous data collection.
We leverage the robot's own trials as demonstrations for tasks other than the one that the robot actually attempted.
In contrast to prior imitation learning approaches, our method can autonomously collect data with sparse supervision for continuous improvement.
arXiv Detail & Related papers (2020-02-25T18:56:42Z)
This list is automatically generated from the titles and abstracts of the papers on this site.