GRI: General Reinforced Imitation and its Application to Vision-Based
Autonomous Driving
- URL: http://arxiv.org/abs/2111.08575v1
- Date: Tue, 16 Nov 2021 15:52:54 GMT
- Title: GRI: General Reinforced Imitation and its Application to Vision-Based
Autonomous Driving
- Authors: Raphael Chekroun, Marin Toromanoff, Sascha Hornauer, Fabien Moutarde
- Abstract summary: General Reinforced Imitation (GRI) is a novel method which combines benefits from exploration and expert data.
We show that our approach enables major improvements on vision-based autonomous driving in urban environments.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Deep reinforcement learning (DRL) has been demonstrated to be effective for
several complex decision-making applications such as autonomous driving and
robotics. However, DRL is notoriously limited by its high sample complexity and
its lack of stability. Prior knowledge, e.g., expert demonstrations, is often
available but challenging to leverage to mitigate these issues. In this paper,
we propose General Reinforced Imitation (GRI), a novel method which combines
benefits from exploration and expert data and is straightforward to implement
over any off-policy RL algorithm. We make one simplifying hypothesis: expert
demonstrations can be seen as perfect data whose underlying policy gets a
constant high reward. Based on this assumption, GRI introduces the notion of an
offline demonstration agent. This agent sends expert data, which are processed
concurrently with, and indistinguishably from, the experiences coming from the
online RL exploration agent. We show that our approach enables major
improvements on vision-based autonomous driving in urban environments. We
further validate the GRI method on MuJoCo continuous control tasks with
different off-policy RL algorithms. Our method ranked first on the CARLA
Leaderboard and outperforms World on Rails, the previous state-of-the-art, by
17%.
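The mechanism described in the abstract is compact enough to sketch in code. Below is a minimal, illustrative Python version of the replay-buffer mixing idea, assuming the simplest reading of the text: demonstration transitions are assigned a constant high reward and sampled into the same batches as online experience, so any off-policy learner consumes both indistinguishably. All names here (GRIReplayBuffer, demo_reward, demo_ratio) are hypothetical and not taken from the paper.

```python
import random
from collections import deque

class GRIReplayBuffer:
    """Illustrative replay buffer mixing online exploration data with
    expert demonstrations; both are treated identically at sampling time."""

    def __init__(self, capacity=100_000, demo_reward=1.0, demo_ratio=0.25):
        self.online = deque(maxlen=capacity)  # (s, a, r, s2, done) from the exploration agent
        self.demos = []                       # expert transitions, kept for the whole run
        self.demo_reward = demo_reward        # simplifying hypothesis: the expert earns a constant high reward
        self.demo_ratio = demo_ratio          # fraction of each batch drawn from demonstrations

    def add_online(self, s, a, r, s2, done):
        self.online.append((s, a, r, s2, done))

    def add_demo(self, s, a, s2, done):
        # The demonstration's true reward is unknown; it is replaced by the
        # constant reward assumed for the underlying expert policy.
        self.demos.append((s, a, self.demo_reward, s2, done))

    def sample(self, batch_size):
        n_demo = min(int(batch_size * self.demo_ratio), len(self.demos))
        batch = random.sample(self.demos, n_demo)
        batch += random.sample(self.online, batch_size - n_demo)
        random.shuffle(batch)  # the learner cannot tell which source a transition came from
        return batch
```

Because the mixing lives entirely in the buffer, such a sketch layers over any off-policy algorithm without touching its update rule, consistent with the paper's claim that GRI is straightforward to implement over any off-policy RL algorithm.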
Related papers
- Privileged to Predicted: Towards Sensorimotor Reinforcement Learning for Urban Driving
Reinforcement Learning (RL) has the potential to surpass human performance in driving without needing any expert supervision.
We propose vision-based deep learning models to approximate the privileged representations from sensor data.
We shed light on the significance of the state representations in RL for autonomous driving and point to unresolved challenges for future research.
arXiv Detail & Related papers (2023-09-18T13:34:41Z)
- Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels
Reinforcement learning algorithms can succeed but require large amounts of interaction between the agent and the environment.
We propose a new method that addresses this by using unsupervised model-based RL to pre-train the agent.
We show robust performance on the Real-World RL benchmark, hinting at resiliency to environment perturbations during adaptation.
arXiv Detail & Related papers (2022-09-24T14:22:29Z)
- Jump-Start Reinforcement Learning
We present a meta-algorithm that can use offline data, demonstrations, or a pre-existing policy to initialize an RL policy.
In particular, we propose Jump-Start Reinforcement Learning (JSRL), an algorithm that employs two policies to solve tasks.
We show via experiments that JSRL is able to significantly outperform existing imitation and reinforcement learning algorithms.
arXiv Detail & Related papers (2022-04-05T17:25:22Z)
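Since the two-policy mechanism is the heart of JSRL, a hedged sketch may make it concrete: a guide policy acts for the first part of each episode, then an exploration policy takes over from the state the guide reaches. The environment interface and all names below are schematic, not the paper's code.

```python
def jump_start_rollout(env, guide_policy, explore_policy, guide_steps, max_steps=1000):
    """Collect one episode JSRL-style: the guide policy controls the first
    `guide_steps` steps, then the exploration policy takes over from there.
    The env is assumed to expose reset() -> obs and step(a) -> (obs, r, done)."""
    transitions = []
    obs = env.reset()
    for t in range(max_steps):
        acting = guide_policy if t < guide_steps else explore_policy
        action = acting(obs)
        obs_next, reward, done = env.step(action)
        transitions.append((obs, action, reward, obs_next, done))
        obs = obs_next
        if done:
            break
    return transitions
```

Training would then gradually anneal guide_steps toward zero, so the exploration policy eventually solves the full task from the initial state on its own.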
- DriverGym: Democratising Reinforcement Learning for Autonomous Driving
We propose DriverGym, an open-source environment for developing reinforcement learning algorithms for autonomous driving.
DriverGym provides access to more than 1000 hours of expert logged data and also supports reactive and data-driven agent behavior.
The performance of an RL policy can be easily validated on real-world data using our extensive and flexible closed-loop evaluation protocol.
arXiv Detail & Related papers (2021-11-12T11:47:08Z)
- WAD: A Deep Reinforcement Learning Agent for Urban Autonomous Driving
This paper introduces the DRL driven Watch and Drive (WAD) agent for end-to-end urban autonomous driving.
Motivated by recent advancements, the study aims to detect important objects/states in the high-dimensional spaces of CARLA and to extract a latent state from them.
Our novel approach, which uses fewer resources, step-by-step learning of different driving tasks, a hard episode-termination policy, and a reward mechanism, has led our agents to achieve a 100% success rate on all driving tasks.
arXiv Detail & Related papers (2021-08-27T06:48:31Z)
- Explore and Control with Adversarial Surprise
Reinforcement learning (RL) provides a framework for learning goal-directed policies given user-specified rewards.
We propose a new unsupervised RL technique based on an adversarial game which pits two policies against each other to compete over the amount of surprise an RL agent experiences.
We show that our method leads to the emergence of complex skills by exhibiting clear phase transitions.
arXiv Detail & Related papers (2021-07-12T17:58:40Z)
- GDI: Rethinking What Makes Reinforcement Learning Different From Supervised Learning
We extend the basic paradigm of RL, Generalized Policy Iteration (GPI), into a more general version called Generalized Data Distribution Iteration (GDI).
Our algorithm has achieved a 9620.98% mean human-normalized score (HNS), a 1146.39% median HNS, and 22 human world record breakthroughs (HWRB) using only 200M training frames.
arXiv Detail & Related papers (2021-06-11T08:31:12Z)
- Learning Dexterous Manipulation from Suboptimal Experts
Relative Entropy Q-Learning (REQ) is a simple policy iteration algorithm that combines ideas from successful offline and conventional RL algorithms.
We show how REQ is also effective for general off-policy RL, offline RL, and RL from demonstrations.
arXiv Detail & Related papers (2020-10-16T18:48:49Z)
- AWAC: Accelerating Online Reinforcement Learning with Offline Datasets
We show that our method, advantage weighted actor critic (AWAC), enables rapid learning of skills with a combination of prior demonstration data and online experience.
Our results show that incorporating prior data can reduce the time required to learn a range of robotic skills to practical time-scales.
arXiv Detail & Related papers (2020-06-16T17:54:41Z)
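The AWAC entry above compresses a concrete update rule: the actor is trained by advantage-weighted regression on replay-buffer actions (both prior demonstrations and online experience), with each action weighted by exp(A(s, a) / beta). Below is a minimal sketch of that loss; the function name, the beta default, and the weight clip are illustrative choices, not taken from the paper.

```python
import numpy as np

def awac_actor_loss(log_probs, q_values, values, beta=1.0):
    """Advantage-weighted actor loss in the spirit of AWAC.

    log_probs: log pi(a|s) for state-action pairs sampled from a buffer
               that mixes prior demonstration data and online experience.
    q_values:  critic estimates Q(s, a) for those pairs.
    values:    baseline V(s), e.g. the critic averaged over policy actions.
    beta:      temperature; smaller values trust the advantage more.
    """
    advantages = q_values - values
    weights = np.exp(advantages / beta)
    weights = np.minimum(weights, 20.0)   # clip very large weights for stability
    return -np.mean(weights * log_probs)  # minimizing clones high-advantage actions harder
```

The effect is that the actor imitates buffer actions, but good actions (high advantage under the current critic) are cloned much more strongly than poor ones.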