Learning to Drive Using Sparse Imitation Reinforcement Learning
- URL: http://arxiv.org/abs/2205.12128v1
- Date: Tue, 24 May 2022 15:03:11 GMT
- Title: Learning to Drive Using Sparse Imitation Reinforcement Learning
- Authors: Yuci Han, Alper Yilmaz
- Abstract summary: We propose a hybrid end-to-end control policy that combines sparse expert driving knowledge with a reinforcement learning (RL) policy.
We experimentally validate the efficacy of the proposed SIRL approach in a complex urban scenario within the CARLA simulator.
- Score: 0.5076419064097732
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: In this paper, we propose Sparse Imitation Reinforcement Learning (SIRL), a hybrid end-to-end control policy that combines sparse expert driving knowledge with a reinforcement learning (RL) policy for the autonomous driving (AD) task in the CARLA simulation environment. The sparse expert is designed from hand-crafted rules; it is suboptimal but provides a risk-averse strategy by encoding experience for critical scenarios such as pedestrian and vehicle avoidance and traffic light detection. As has been demonstrated, training an RL agent from scratch is data-inefficient and time-consuming, particularly for the urban driving task, because of the complexity of situations stemming from the vast state space. Our SIRL strategy addresses these problems by fusing the output distribution of the sparse expert policy with that of the RL policy to generate a composite driving policy. Guided by the sparse expert during the early training stage, SIRL accelerates training, keeps RL exploration from causing catastrophic outcomes, and ensures safe exploration. To some extent, the SIRL agent imitates the driving expert's behavior; at the same time, it continuously gains knowledge during training, so it keeps improving beyond the sparse expert and can surpass both the sparse expert and a traditional RL agent. We experimentally validate the efficacy of the proposed SIRL approach in a complex urban scenario within the CARLA simulator. In addition, we compare the SIRL agent with the traditional RL approach in terms of risk-averse exploration and learning efficiency, and we demonstrate the SIRL agent's ability to generalize its driving skill to unseen environments.
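As a concrete illustration of fusing the two output distributions, the following is a minimal sketch that assumes both policies emit categorical distributions over a discretized action set and blends them with an expert weight that decays over training. The schedule, array shapes, and function names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def fuse_policies(expert_probs, rl_probs, expert_weight):
    """Blend the expert and RL action distributions into a composite policy.

    Sketch only: assumes both policies output categorical distributions
    over the same discretized action set (e.g., steering/throttle bins).
    """
    fused = expert_weight * expert_probs + (1.0 - expert_weight) * rl_probs
    return fused / fused.sum()  # renormalize for numerical safety

def expert_weight_schedule(step, warmup_steps=50_000):
    """Illustrative schedule: lean on the expert early in training, then
    hand control over to the RL policy as training progresses."""
    return max(0.0, 1.0 - step / warmup_steps)

# Example: sample a composite action at training step 10,000.
expert_probs = np.array([0.7, 0.2, 0.1])  # risk-averse rule-based prior
rl_probs = np.array([0.2, 0.5, 0.3])      # learned RL policy output
w = expert_weight_schedule(10_000)
action = np.random.choice(3, p=fuse_policies(expert_probs, rl_probs, w))
```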
Related papers
- CIMRL: Combining IMitation and Reinforcement Learning for Safe Autonomous Driving [45.05135725542318]
The Combining IMitation and Reinforcement Learning (CIMRL) approach enables training driving policies in simulation by leveraging imitative motion priors and safety constraints.
By combining RL and imitation, we demonstrate that our method achieves state-of-the-art results in closed-loop simulation and real-world driving benchmarks.
arXiv Detail & Related papers (2024-06-13T07:31:29Z)
- RACER: Epistemic Risk-Sensitive RL Enables Fast Driving with Fewer Crashes [57.319845580050924]
We propose a reinforcement learning framework that combines risk-sensitive control with an adaptive action space curriculum.
We show that our algorithm is capable of learning high-speed policies for a real-world off-road driving task.
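The adaptive action-space curriculum is only named here; below is a hedged sketch of one plausible reading, in which the allowed speed range widens only while the recent crash rate stays low. The thresholds and the widening rule are assumptions for illustration, not details from the paper.

```python
from collections import deque

class SpeedCurriculum:
    """Illustrative action-space curriculum: cap the commanded speed and
    widen the cap only while recent episodes remain mostly crash-free."""
    def __init__(self, start_cap=2.0, max_cap=10.0, step=0.5,
                 window=20, crash_tolerance=0.1):
        self.cap = start_cap
        self.max_cap = max_cap
        self.step = step
        self.crashes = deque(maxlen=window)
        self.crash_tolerance = crash_tolerance

    def record_episode(self, crashed: bool):
        self.crashes.append(crashed)
        window_full = len(self.crashes) == self.crashes.maxlen
        if window_full and sum(self.crashes) / len(self.crashes) <= self.crash_tolerance:
            self.cap = min(self.max_cap, self.cap + self.step)  # widen action space

    def clip_action(self, speed_cmd: float) -> float:
        return max(-self.cap, min(self.cap, speed_cmd))
```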
arXiv Detail & Related papers (2024-05-07T23:32:36Z)
- Guided Online Distillation: Promoting Safe Reinforcement Learning by Offline Demonstration [75.51109230296568]
We argue that extracting an expert policy from offline data to guide online exploration is a promising solution to mitigate the conservativeness issue.
We propose Guided Online Distillation (GOLD), an offline-to-online safe RL framework.
GOLD distills an offline Decision Transformer (DT) policy into a lightweight policy network through guided online safe RL training, and outperforms both the offline DT policy and online safe RL algorithms.
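One common way to realize such guided distillation is to add a divergence penalty toward the frozen offline policy to the online objective. The sketch below assumes a PyTorch-style categorical policy and a fixed KL coefficient; these are illustrative choices rather than details taken from the paper.

```python
import torch
import torch.nn.functional as F

def guided_policy_loss(student_logits, expert_logits, advantages,
                       log_probs_taken, kl_coef=0.1):
    """Sketch of an online policy-gradient loss with a distillation term
    that pulls the lightweight student toward a frozen offline expert."""
    # Standard policy-gradient surrogate on the student's own rollouts.
    pg_loss = -(advantages * log_probs_taken).mean()
    # KL(student || expert): keeps online exploration close to the expert.
    student_logp = F.log_softmax(student_logits, dim=-1)
    expert_logp = F.log_softmax(expert_logits.detach(), dim=-1)
    kl = (student_logp.exp() * (student_logp - expert_logp)).sum(dim=-1).mean()
    return pg_loss + kl_coef * kl
```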
arXiv Detail & Related papers (2023-09-18T00:22:59Z)
- Action and Trajectory Planning for Urban Autonomous Driving with Hierarchical Reinforcement Learning [1.3397650653650457]
We propose an action and trajectory planner using a Hierarchical Reinforcement Learning (atHRL) method.
We empirically verify the efficacy of atHRL through extensive experiments in complex urban driving scenarios.
arXiv Detail & Related papers (2023-06-28T07:11:02Z)
- Rethinking Closed-loop Training for Autonomous Driving [82.61418945804544]
We present the first empirical study which analyzes the effects of different training benchmark designs on the success of learning agents.
We propose trajectory value learning (TRAVL), an RL-based driving agent that performs planning with multistep look-ahead.
Our experiments show that TRAVL can learn much faster and produce safer maneuvers compared to all the baselines.
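Planning with multistep look-ahead over trajectory values can be sketched as scoring candidate action sequences with a learned value function and executing the first action of the best one. The rollout model, horizon, and candidate set below are assumptions for illustration, not TRAVL's actual components.

```python
def plan_with_lookahead(state, candidate_sequences, dynamics_model, value_fn,
                        reward_fn, gamma=0.99):
    """Score each candidate action sequence by its simulated return plus the
    learned value of the final state, then execute the best first action."""
    best_score, best_first_action = float("-inf"), None
    for actions in candidate_sequences:
        s, score, discount = state, 0.0, 1.0
        for a in actions:
            s_next = dynamics_model(s, a)          # simulated transition
            score += discount * reward_fn(s, a, s_next)
            discount *= gamma
            s = s_next
        score += discount * value_fn(s)            # bootstrap with learned value
        if score > best_score:
            best_score, best_first_action = score, actions[0]
    return best_first_action
```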
arXiv Detail & Related papers (2023-06-27T17:58:39Z)
- Risk-Aware Reward Shaping of Reinforcement Learning Agents for Autonomous Driving [6.613838702441967]
This paper investigates how risk-aware reward shaping can improve the training and test performance of RL agents in autonomous driving.
We propose additional reshaped reward terms that encourage exploration and penalize risky driving behaviors.
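A minimal sketch of such reward shaping is shown below; the specific bonus and penalty terms (a distance-based exploration bonus, collision and near-miss penalties) and their weights are illustrative assumptions rather than the paper's exact terms.

```python
def shaped_reward(base_reward, distance_travelled, collided, time_to_collision,
                  explore_coef=0.01, collision_penalty=10.0,
                  risk_penalty=1.0, ttc_threshold=2.0):
    """Base driving reward plus an exploration bonus and risk penalties."""
    r = base_reward
    r += explore_coef * distance_travelled          # encourage making progress
    if collided:
        r -= collision_penalty                      # hard penalty for crashes
    if time_to_collision < ttc_threshold:
        r -= risk_penalty                           # penalize risky near-misses
    return r
```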
arXiv Detail & Related papers (2023-06-05T20:10:36Z)
- DeFIX: Detecting and Fixing Failure Scenarios with Reinforcement Learning in Imitation Learning Based Autonomous Driving [0.0]
We present a Reinforcement Learning (RL) based methodology to DEtect and FIX failures of an imitation learning (IL) agent.
DeFIX is a continuous learning framework, where extraction of failure scenarios and training of RL agents are executed in an infinite loop.
It is demonstrated that even with only one RL agent trained on the failure scenarios of an IL agent, the DeFIX method is competitive with or outperforms state-of-the-art IL- and RL-based autonomous urban driving benchmarks.
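The continuous detect-and-fix loop described above can be sketched roughly as follows; the function names and the stopping rule are illustrative assumptions about the framework's structure, not the authors' code.

```python
def defix_loop(il_agent, environment, extract_failures, train_rl_on,
               max_rounds=None):
    """Sketch of DeFIX-style continuous learning: repeatedly mine the IL
    agent's failure scenarios and train a dedicated RL agent to handle them."""
    rl_specialists = []
    round_idx = 0
    while max_rounds is None or round_idx < max_rounds:
        failures = extract_failures(il_agent, rl_specialists, environment)
        if not failures:
            break                        # no remaining failure scenarios found
        rl_agent = train_rl_on(failures, environment)
        rl_specialists.append(rl_agent)  # RL agent takes over in its scenarios
        round_idx += 1
    return rl_specialists
```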
arXiv Detail & Related papers (2022-10-29T10:58:43Z)
- Constrained Reinforcement Learning for Robotics via Scenario-Based Programming [64.07167316957533]
It is crucial to optimize the performance of DRL-based agents while providing guarantees about their behavior.
This paper presents a novel technique for incorporating domain-expert knowledge into a constrained DRL training loop.
Our experiments demonstrate that using our approach to leverage expert knowledge dramatically improves the safety and the performance of the agent.
arXiv Detail & Related papers (2022-06-20T07:19:38Z)
- Automatically Learning Fallback Strategies with Model-Free Reinforcement Learning in Safety-Critical Driving Scenarios [9.761912672523977]
We present a principled approach for a model-free Reinforcement Learning (RL) agent to capture multiple modes of behaviour in an environment.
We introduce an extra pseudo-reward term into the reward model to encourage exploration of areas of the state space that differ from those favored by the optimal policy.
We show that we are able to learn useful policies that would otherwise have been missed during training and been unavailable when executing the control algorithm.
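One simple way to realize such a pseudo-reward is to reward distance from states frequently visited under the primary (optimal) policy; the nearest-neighbor novelty measure and coefficient below are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def pseudo_reward(state, primary_policy_states, base_reward, novelty_coef=0.1):
    """Sketch: bonus for visiting states far from those favored by the
    primary policy, encouraging distinct (fallback) modes of behavior."""
    if len(primary_policy_states) == 0:
        return base_reward
    dists = np.linalg.norm(
        np.asarray(primary_policy_states) - np.asarray(state), axis=1)
    novelty = dists.min()   # distance to the nearest "privileged" state
    return base_reward + novelty_coef * novelty
```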
arXiv Detail & Related papers (2022-04-11T15:34:49Z)
- Explore and Control with Adversarial Surprise [78.41972292110967]
Reinforcement learning (RL) provides a framework for learning goal-directed policies given user-specified rewards.
We propose a new unsupervised RL technique based on an adversarial game which pits two policies against each other to compete over the amount of surprise an RL agent experiences.
We show that our method leads to the emergence of complex skills by exhibiting clear phase transitions.
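The two-player surprise game can be sketched as one policy rewarded for the surprise the agent experiences and an opposing policy rewarded for minimizing it. Using the negative log-likelihood of observations under a learned density model as the surprise measure, and the zero-sum reward split below, are assumptions for illustration.

```python
def surprise(observation, density_model):
    """Surprise as negative log-likelihood under a learned observation model
    (the density_model interface is an assumed placeholder)."""
    return -density_model.log_prob(observation)

def adversarial_surprise_rewards(observation, density_model):
    """Zero-sum rewards: the explorer seeks surprising observations,
    the controller seeks familiar, predictable ones."""
    s = surprise(observation, density_model)
    return {"explorer": s, "controller": -s}
```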
arXiv Detail & Related papers (2021-07-12T17:58:40Z)
- Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings [129.80279257258098]
Reinforcement learning (RL) in real-world safety-critical target settings like urban driving is hazardous.
We propose a "safety-critical adaptation" task setting: an agent first trains in non-safety-critical "source" environments before adapting to the safety-critical target environment.
We propose a solution approach, CARL, that builds on the intuition that prior experience in diverse environments equips an agent to estimate risk.
arXiv Detail & Related papers (2020-08-15T01:40:59Z)
This list is automatically generated from the titles and abstracts of the papers on this site.