Robust Driving Policy Learning with Guided Meta Reinforcement Learning
- URL: http://arxiv.org/abs/2307.10160v1
- Date: Wed, 19 Jul 2023 17:42:36 GMT
- Title: Robust Driving Policy Learning with Guided Meta Reinforcement Learning
- Authors: Kanghoon Lee, Jiachen Li, David Isele, Jinkyoo Park, Kikuo Fujimura,
Mykel J. Kochenderfer
- Abstract summary: We introduce an efficient method to train diverse driving policies for social vehicles as a single meta-policy.
By randomizing the interaction-based reward functions of social vehicles, we can generate diverse objectives and efficiently train the meta-policy.
We propose a training strategy to enhance the robustness of the ego vehicle's driving policy using the environment where social vehicles are controlled by the learned meta-policy.
- Score: 49.860391298275616
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although deep reinforcement learning (DRL) has shown promising results for
autonomous navigation in interactive traffic scenarios, existing work typically
adopts a fixed behavior policy to control social vehicles in the training
environment. This may cause the learned driving policy to overfit the
environment, making it difficult to interact well with vehicles with different,
unseen behaviors. In this work, we introduce an efficient method to train
diverse driving policies for social vehicles as a single meta-policy. By
randomizing the interaction-based reward functions of social vehicles, we can
generate diverse objectives and efficiently train the meta-policy through
guiding policies that achieve specific objectives. We further propose a
training strategy to enhance the robustness of the ego vehicle's driving policy
using the environment where social vehicles are controlled by the learned
meta-policy. Our method successfully learns an ego driving policy that
generalizes well to unseen situations with out-of-distribution (OOD) social
agents' behaviors in a challenging uncontrolled T-intersection scenario.
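As a rough illustration of "randomizing the interaction-based reward functions of social vehicles," the sketch below samples a different set of reward weights for each social vehicle in an episode. The abstract does not give the form of the reward, so everything here is an assumption made for illustration: the three reward terms, their weight ranges, and the names `RewardWeights`, `sample_reward_weights`, `social_reward`, and `sample_episode_objectives` are all hypothetical and not the authors' formulation.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    """Hypothetical weights defining one social vehicle's objective."""
    progress: float   # weight on the vehicle's own forward progress
    courtesy: float   # weight on delay imposed on others (the interaction term)
    collision: float  # penalty weight applied when a collision occurs


def sample_reward_weights(rng: random.Random) -> RewardWeights:
    """Draw one randomized objective; the ranges are illustrative assumptions."""
    return RewardWeights(
        progress=rng.uniform(0.5, 1.5),
        courtesy=rng.uniform(-1.0, 1.0),  # negative values ignore or exploit others
        collision=rng.uniform(5.0, 10.0),
    )


def social_reward(w: RewardWeights, own_progress: float,
                  others_delay: float, collided: bool) -> float:
    """Weighted sum of the assumed terms for a single time step."""
    return (w.progress * own_progress
            - w.courtesy * others_delay
            - w.collision * float(collided))


def sample_episode_objectives(num_vehicles: int, seed: int) -> list[RewardWeights]:
    """Assign an independently sampled objective to each social vehicle."""
    rng = random.Random(seed)
    return [sample_reward_weights(rng) for _ in range(num_vehicles)]
```

In a setup like this, a policy conditioned on the sampled weights would see a different objective in each episode, which is one plausible way to obtain diverse social-vehicle behaviors from a single network.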
Related papers
- RACER: Epistemic Risk-Sensitive RL Enables Fast Driving with Fewer Crashes [57.319845580050924]
We propose a reinforcement learning framework that combines risk-sensitive control with an adaptive action space curriculum.
We show that our algorithm is capable of learning high-speed policies for a real-world off-road driving task.
(2024-05-07T23:32:36Z)
- Exploring the trade off between human driving imitation and safety for traffic simulation [0.34410212782758043]
We show that a trade-off exists between imitating human driving and maintaining safety when learning driving policies.
We propose a multi-objective learning algorithm (MOPPO) that improves both objectives together.
(2022-08-09T14:30:19Z)
- Automatically Learning Fallback Strategies with Model-Free Reinforcement Learning in Safety-Critical Driving Scenarios [9.761912672523977]
We present a principled approach for a model-free Reinforcement Learning (RL) agent to capture multiple modes of behaviour in an environment.
We introduce an extra pseudo-reward term to the reward model to encourage exploration of areas of the state space that differ from those favored by the optimal policy.
We show that we are able to learn useful policies that would otherwise have been missed during training and would be unavailable when executing the control algorithm.
(2022-04-11T15:34:49Z)
- Learning Interactive Driving Policies via Data-driven Simulation [125.97811179463542]
Data-driven simulators promise high data-efficiency for driving policy learning.
Small underlying datasets often lack interesting and challenging edge cases for learning interactive driving.
We propose a simulation method that uses in-painted ado vehicles for learning robust driving policies.
(2021-11-23T20:14:02Z)
- Learning Interaction-aware Guidance Policies for Motion Planning in Dense Traffic Scenarios [8.484564880157148]
This paper presents a novel framework for interaction-aware motion planning in dense traffic scenarios.
We propose to learn, via deep Reinforcement Learning (RL), an interaction-aware policy providing global guidance about the cooperativeness of other vehicles.
The learned policy can reason and guide the local optimization-based planner with interactive behavior to pro-actively merge in dense traffic while remaining safe in case the other vehicles do not yield.
(2021-07-09T16:43:12Z)
- Learning to drive from a world on rails [78.28647825246472]
We learn an interactive vision-based driving policy from pre-recorded driving logs via a model-based approach.
A forward model of the world supervises a driving policy that predicts the outcome of any potential driving trajectory.
Our method ranks first on the CARLA leaderboard, attaining a 25% higher driving score while using 40 times less data.
(2021-05-03T05:55:30Z)
- MetaVIM: Meta Variationally Intrinsic Motivated Reinforcement Learning for Decentralized Traffic Signal Control [54.162449208797334]
Traffic signal control aims to coordinate traffic signals across intersections to improve the traffic efficiency of a district or a city.
Deep reinforcement learning (RL) has been applied to traffic signal control recently and demonstrated promising performance where each traffic signal is regarded as an agent.
We propose a novel Meta Variationally Intrinsic Motivated (MetaVIM) RL method to learn the decentralized policy for each intersection that considers neighbor information in a latent way.
(2021-01-04T03:06:08Z)
- Reinforcement Learning based Control of Imitative Policies for Near-Accident Driving [41.54021613421446]
In near-accident scenarios, even a minor change in the vehicle's actions may result in drastically different consequences.
We propose a hierarchical reinforcement and imitation learning (H-ReIL) approach that consists of low-level policies learned by IL for discrete driving modes, and a high-level policy learned by RL that switches between different driving modes.
(2020-07-01T01:41:45Z)
- Intelligent Roundabout Insertion using Deep Reinforcement Learning [68.8204255655161]
We present a maneuver planning module able to negotiate entry into busy roundabouts.
The proposed module is based on a neural network trained to predict when and how to enter the roundabout throughout the whole duration of the maneuver.
(2020-01-03T11:16:41Z)
This list is automatically generated from the titles and abstracts of the papers on this site.