Watch and Match: Supercharging Imitation with Regularized Optimal
Transport
- URL: http://arxiv.org/abs/2206.15469v1
- Date: Thu, 30 Jun 2022 17:58:18 GMT
- Authors: Siddhant Haldar and Vaibhav Mathur and Denis Yarats and Lerrel Pinto
- Abstract summary: Regularized Optimal Transport (ROT) is a new imitation learning algorithm that builds on recent advances in optimal-transport-based trajectory matching.
Experiments on 20 visual control tasks across the DeepMind Control Suite, the OpenAI Robotics Suite, and the Meta-World Benchmark show that ROT reaches 90% of expert performance an average of 7.8X faster than prior state-of-the-art methods.
- Score: 28.3572924961148
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Imitation learning holds tremendous promise in learning policies efficiently
for complex decision making problems. Current state-of-the-art algorithms often
use inverse reinforcement learning (IRL), where given a set of expert
demonstrations, an agent alternately infers a reward function and the
associated optimal policy. However, such IRL approaches often require
substantial online interactions for complex control problems. In this work, we
present Regularized Optimal Transport (ROT), a new imitation learning algorithm
that builds on recent advances in optimal-transport-based trajectory matching.
Our key technical insight is that adaptively combining trajectory-matching
rewards with behavior cloning can significantly accelerate imitation even with
only a few demonstrations. Our experiments on 20 visual control tasks across
the DeepMind Control Suite, the OpenAI Robotics Suite, and the Meta-World
Benchmark demonstrate an average of 7.8X faster imitation to reach 90% of
expert performance compared to prior state-of-the-art methods. On real-world
robotic manipulation, with just one demonstration and an hour of online
training, ROT achieves an average success rate of 90.1% across 14 tasks.
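
The idea of combining trajectory-matching rewards with behavior cloning can be illustrated with a small sketch. This is not the authors' implementation: the function names, the cosine cost, the entropic regularization strength `eps`, and the mixing weight `lam` are all assumptions made for illustration, and the abstract does not say how the weighting is adapted during training. The sketch computes an entropy-regularized optimal transport plan between agent and expert features with Sinkhorn iterations, turns the transported cost into per-step rewards, and blends a critic-based actor loss with a behavior cloning term.

```python
import numpy as np

def cosine_cost(agent, expert):
    """Pairwise cosine distance between rows of agent (T, d) and expert (T', d)."""
    a = agent / (np.linalg.norm(agent, axis=1, keepdims=True) + 1e-8)
    e = expert / (np.linalg.norm(expert, axis=1, keepdims=True) + 1e-8)
    return 1.0 - a @ e.T

def sinkhorn_plan(cost, eps=0.05, n_iters=200):
    """Entropy-regularized OT plan between uniform marginals (Sinkhorn iterations)."""
    n, m = cost.shape
    mu = np.full(n, 1.0 / n)      # uniform mass over agent steps
    nu = np.full(m, 1.0 / m)      # uniform mass over expert steps
    K = np.exp(-cost / eps)       # Gibbs kernel
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = mu / (K @ v)
        v = nu / (K.T @ u)
    return u[:, None] * K * v[None, :]

def ot_rewards(agent, expert, eps=0.05):
    """Per-step reward: negative cost transported from each agent step."""
    cost = cosine_cost(agent, expert)
    plan = sinkhorn_plan(cost, eps)
    return -(plan * cost).sum(axis=1)

def regularized_actor_loss(q_values, bc_errors, lam):
    """Hypothetical blend: (1 - lam) * RL objective + lam * behavior cloning error."""
    return (1.0 - lam) * (-np.mean(q_values)) + lam * np.mean(bc_errors)

if __name__ == "__main__":
    expert = np.eye(5)                       # toy expert features, one per step
    matching = ot_rewards(expert.copy(), expert)
    mismatched = ot_rewards(np.ones((5, 5)), expert)
    print("matching total reward:  ", matching.sum())
    print("mismatched total reward:", mismatched.sum())
```

In this toy setting an agent trajectory whose features coincide with the expert's receives a total reward near zero, while a mismatched trajectory receives a clearly negative one. Setting `lam` close to 1 makes the update behave like behavior cloning, and setting it close to 0 leaves only the reward-driven term.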
Related papers
- Offline Imitation Learning Through Graph Search and Retrieval [57.57306578140857]
Imitation learning is a powerful machine learning algorithm for a robot to acquire manipulation skills.
We propose GSR, a simple yet effective algorithm that learns from suboptimal demonstrations through Graph Search and Retrieval.
GSR can achieve a 10% to 30% higher success rate and over 30% higher proficiency compared to baselines.
arXiv Detail & Related papers (2024-07-22T06:12:21Z)
- RLIF: Interactive Imitation Learning as Reinforcement Learning [56.997263135104504]
We show how off-policy reinforcement learning can enable improved performance under assumptions that are similar but potentially even more practical than those of interactive imitation learning.
Our proposed method uses reinforcement learning with user intervention signals themselves as rewards.
This relaxes the assumption that intervening experts in interactive imitation learning should be near-optimal and enables the algorithm to learn behaviors that improve over a potentially suboptimal human expert.
arXiv Detail & Related papers (2023-11-21T21:05:21Z)
- Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning [58.3994826169858]
We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning.
Our insights are to utilize offline reinforcement learning techniques to ensure efficient online fine-tuning of a pre-trained policy.
Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
arXiv Detail & Related papers (2023-10-23T17:50:08Z)
- ReProHRL: Towards Multi-Goal Navigation in the Real World using Hierarchical Agents [1.3194749469702445]
We present Ready for Production Hierarchical RL (ReProHRL) that divides tasks with hierarchical multi-goal navigation guided by reinforcement learning.
We also use object detectors as a pre-processing step to learn multi-goal navigation and transfer it to the real world.
For the real-world implementation and proof of concept demonstration, we deploy the proposed method on a nano-drone named Crazyflie with a front camera.
arXiv Detail & Related papers (2023-08-17T02:23:59Z)
- Practical Imitation Learning in the Real World via Task Consistency Loss [18.827979446629296]
This paper introduces a self-supervised loss that encourages sim and real alignment both at the feature and action-prediction levels.
We achieve 80% success across ten seen and unseen scenes using only 16.2 hours of teleoperated demonstrations in sim and real.
arXiv Detail & Related papers (2022-02-03T21:43:06Z)
- Accelerating Robotic Reinforcement Learning via Parameterized Action Primitives [92.0321404272942]
Reinforcement learning can be used to build general-purpose robotic systems.
However, training RL agents to solve robotics tasks still remains challenging.
In this work, we manually specify a library of robot action primitives (RAPS), parameterized with arguments that are learned by an RL policy.
We find that our simple change to the action interface substantially improves both the learning efficiency and task performance.
arXiv Detail & Related papers (2021-10-28T17:59:30Z)
- Vision-Based Autonomous Car Racing Using Deep Imitative Reinforcement Learning [13.699336307578488]
The deep imitative reinforcement learning (DIRL) approach achieves agile autonomous racing using visual inputs.
We validate our algorithm both in a high-fidelity driving simulation and on a real-world 1/20-scale RC-car with limited onboard computation.
arXiv Detail & Related papers (2021-07-18T00:00:48Z)
- A Framework for Efficient Robotic Manipulation [79.10407063260473]
We show that, given only 10 demonstrations, a single robotic arm can learn sparse-reward manipulation policies from pixels.
arXiv Detail & Related papers (2020-12-14T22:18:39Z)
- Reinforcement Learning Experiments and Benchmark for Solving Robotic Reaching Tasks [0.0]
Reinforcement learning has been successfully applied to solving the reaching task with robotic arms.
It is shown that augmenting the reward signal with the Hindsight Experience Replay exploration technique increases the average return of off-policy agents.
arXiv Detail & Related papers (2020-11-11T14:00:49Z)
- Learning Dexterous Manipulation from Suboptimal Experts [69.8017067648129]
Relative Entropy Q-Learning (REQ) is a simple policy algorithm that combines ideas from successful offline and conventional RL algorithms.
We show how REQ is also effective for general off-policy RL, offline RL, and RL from demonstrations.
arXiv Detail & Related papers (2020-10-16T18:48:49Z)
- Assembly robots with optimized control stiffness through reinforcement learning [3.4410212782758047]
We propose a methodology that uses reinforcement learning to achieve high performance in robots.
The proposed method ensures the online generation of stiffness matrices that help improve the performance of local trajectory optimization.
The effectiveness of the method was verified via experiments involving two contact-rich tasks.
arXiv Detail & Related papers (2020-02-27T15:54:43Z)