Reinforcement Learning with Iterative Reasoning for Merging in Dense
Traffic
- URL: http://arxiv.org/abs/2005.11895v1
- Date: Mon, 25 May 2020 02:57:19 GMT
- Title: Reinforcement Learning with Iterative Reasoning for Merging in Dense
Traffic
- Authors: Maxime Bouton, Alireza Nakhaei, David Isele, Kikuo Fujimura, and Mykel
J. Kochenderfer
- Abstract summary: Maneuvering in dense traffic is a challenging task for autonomous vehicles.
We propose a combination of reinforcement learning and game theory to learn merging behaviors.
- Score: 41.46201285202203
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Maneuvering in dense traffic is a challenging task for autonomous vehicles
because it requires reasoning about the stochastic behaviors of many other
participants. In addition, the agent must achieve the maneuver within a limited
time and distance. In this work, we propose a combination of reinforcement
learning and game theory to learn merging behaviors. We design a training
curriculum for a reinforcement learning agent using the concept of level-$k$
behavior. This approach exposes the agent to a broad variety of behaviors
during training, which promotes learning policies that are robust to model
discrepancies. We show that our approach learns more efficient policies than
traditional training methods.
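To make the curriculum concrete: a level-0 driver follows a simple, non-strategic rule, and the level-$k$ agent is trained with reinforcement learning against traffic populated by lower-level drivers. Below is a minimal sketch of that loop; `make_env` and `train_rl_agent` are hypothetical placeholders, not the authors' code.

```python
# Minimal sketch of a level-k training curriculum; make_env and
# train_rl_agent are hypothetical placeholders, not the authors' code.

def rule_based_policy(observation):
    """Level-0 behavior: a simple, non-strategic driving rule."""
    return 0  # placeholder action (e.g., maintain speed)

def train_level_k_curriculum(make_env, train_rl_agent, k_max=3):
    policies = [rule_based_policy]  # level-0 seeds the curriculum
    for k in range(1, k_max + 1):
        # Surrounding drivers sample behaviors from all lower levels, which
        # exposes the learner to a broad variety of opponents.
        env = make_env(opponent_policies=policies)
        policies.append(train_rl_agent(env))  # level-k responds to levels < k
    return policies[-1]  # final merging policy
```

Training against the whole population of lower-level policies, rather than a single fixed opponent, is what promotes robustness to model discrepancies.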
Related papers
- Analyzing Closed-loop Training Techniques for Realistic Traffic Agent Models in Autonomous Highway Driving Simulations [4.486517725808305]
We provide an extensive comparative analysis of different training principles, with a focus on closed-loop methods for highway driving simulation.
We experimentally compare (i) open-loop vs. closed-loop multi-agent training, (ii) adversarial vs. deterministic supervised training, (iii) the impact of reinforcement losses, and (iv) the impact of training alongside log-replayed agents to identify suitable training techniques for realistic agent modeling.
arXiv Detail & Related papers (2024-10-21T13:16:58Z)
- RLIF: Interactive Imitation Learning as Reinforcement Learning [56.997263135104504]
We show how off-policy reinforcement learning can enable improved performance under assumptions that are similar but potentially even more practical than those of interactive imitation learning.
Our proposed method uses reinforcement learning with user intervention signals themselves as rewards.
This relaxes the assumption that intervening experts in interactive imitation learning should be near-optimal and enables the algorithm to learn behaviors that improve over the potential suboptimal human expert.
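The reward construction itself is simple to state: the agent is penalized whenever the expert intervenes, and no task reward is required. A hedged sketch of one data-collection step follows; every interface in it is illustrative, not the authors' API.

```python
# Sketch of intervention-as-reward data collection (RLIF-style); the
# env, agent, expert, and replay_buffer interfaces are all illustrative.

def collect_transition(env, agent, expert, replay_buffer):
    obs = env.current_observation()
    action = agent.act(obs)
    if expert.would_intervene(obs, action):
        action = expert.act(obs)  # the expert takes over
        reward = -1.0             # the intervention itself is the penalty
    else:
        reward = 0.0              # no task reward is needed
    next_obs, done = env.step(action)
    replay_buffer.add(obs, action, reward, next_obs, done)
```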
arXiv Detail & Related papers (2023-11-21T21:05:21Z)
- Developing Driving Strategies Efficiently: A Skill-Based Hierarchical Reinforcement Learning Approach [0.7373617024876725]
Reinforcement learning is a common tool to model driver policies.
We propose skill-based" hierarchical driving strategies, where motion primitives are designed and used as high-level actions.
arXiv Detail & Related papers (2023-02-04T15:09:51Z)
- Coach-assisted Multi-Agent Reinforcement Learning Framework for Unexpected Crashed Agents [120.91291581594773]
We present a formal formulation of a cooperative multi-agent reinforcement learning system with unexpected crashes.
We propose a coach-assisted multi-agent reinforcement learning framework, which introduces a virtual coach agent to adjust the crash rate during training.
To the best of our knowledge, this work is the first to study unexpected crashes in multi-agent systems.
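The coach can be pictured as a curriculum knob that raises the crash rate once the team copes with the current level and lowers it when performance collapses. A speculative sketch; the adjustment rule below is illustrative, not the paper's exact schedule:

```python
# Sketch of a virtual coach that adapts the crash rate during training;
# the adjustment rule is illustrative, not the paper's exact schedule.
class VirtualCoach:
    def __init__(self, crash_rate=0.0, step=0.05, max_rate=0.5):
        self.crash_rate = crash_rate
        self.step = step
        self.max_rate = max_rate

    def adjust(self, recent_team_return, target_return):
        # Harden the curriculum once the team copes with the current rate;
        # soften it when performance collapses.
        if recent_team_return >= target_return:
            self.crash_rate = min(self.crash_rate + self.step, self.max_rate)
        else:
            self.crash_rate = max(self.crash_rate - self.step, 0.0)

    def sample_crashes(self, rng, num_agents):
        # Each agent independently crashes with the current probability.
        return [rng.random() < self.crash_rate for _ in range(num_agents)]
```

A training loop would call adjust() between evaluation rounds and sample_crashes() at each episode reset.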
arXiv Detail & Related papers (2022-03-16T08:22:45Z)
- Rethinking Learning Dynamics in RL using Adversarial Networks [79.56118674435844]
We present a learning mechanism for reinforcement learning of closely related skills parameterized via a skill embedding space.
The main contribution of our work is to formulate an adversarial training regime for reinforcement learning with the help of entropy-regularized policy gradient formulation.
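In a common form, the entropy-regularized policy gradient objective for skill-conditioned policies reads as below; in the adversarial regime, the skill distribution $p(z)$ would be supplied or reweighted by the adversary. This rendering is an assumption, not the paper's exact formulation:

```latex
J(\theta) = \mathbb{E}_{z \sim p(z)}\, \mathbb{E}_{\tau \sim \pi_\theta(\cdot \mid z)}
\Big[ \sum_t r(s_t, a_t) + \alpha\, \mathcal{H}\big(\pi_\theta(\cdot \mid s_t, z)\big) \Big]
```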
arXiv Detail & Related papers (2022-01-27T19:51:09Z)
- PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training [94.87393610927812]
We present an off-policy, interactive reinforcement learning algorithm that capitalizes on the strengths of both feedback and off-policy learning.
We demonstrate that our approach is capable of learning tasks of higher complexity than previously considered by human-in-the-loop methods.
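Two mechanics drive the feedback efficiency: a reward model fit to pairwise human preferences (typically via a Bradley-Terry model) and relabeling of stored experience whenever that model changes. A simplified sketch; the reward-model interface and replay-buffer layout are assumptions:

```python
# Simplified sketch of two key mechanics of preference-based RL; the
# reward-model interface and buffer layout are assumptions for illustration.
import numpy as np

def preference_probability(reward_model, segment_a, segment_b):
    # Bradley-Terry model: probability the teacher prefers segment A over B
    # under the current learned reward (numerically stable logistic form).
    r_a = sum(reward_model.predict(o, a) for o, a in segment_a)
    r_b = sum(reward_model.predict(o, a) for o, a in segment_b)
    return 1.0 / (1.0 + np.exp(r_b - r_a))

def relabel_replay_buffer(buffer, reward_model):
    # After the reward model is refit to new preferences, rewrite the reward
    # of every stored transition so past experience stays consistent with the
    # current estimate; this is what enables off-policy reuse of old data.
    for transition in buffer:
        transition.reward = float(reward_model.predict(transition.obs, transition.action))
```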
arXiv Detail & Related papers (2021-06-09T14:10:50Z)
- Affordance-based Reinforcement Learning for Urban Driving [3.507764811554557]
We propose a deep reinforcement learning framework to learn optimal control policy using waypoints and low-dimensional visual representations.
We demonstrate that our agents, when trained from scratch, learn the tasks of lane-following, driving around intersections, and stopping in front of other actors or traffic lights, even in the dense traffic setting.
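The input described here, route waypoints plus a low-dimensional visual representation, can be assembled into a single observation vector. A minimal sketch, assuming a visual encoder and a route planner as interfaces (both hypothetical):

```python
# Sketch of assembling the low-dimensional observation; the encoder and
# route planner are hypothetical interfaces.
import numpy as np

def build_observation(camera_image, visual_encoder, route_planner, num_waypoints=5):
    visual_features = visual_encoder.encode(camera_image)    # shape (d,)
    waypoints = route_planner.next_waypoints(num_waypoints)  # (num_waypoints, 2), ego frame
    return np.concatenate([visual_features, waypoints.reshape(-1)])
```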
arXiv Detail & Related papers (2021-01-15T05:21:25Z)
- Behaviorally Diverse Traffic Simulation via Reinforcement Learning [16.99423598448411]
This paper introduces an easily-tunable policy generation algorithm for autonomous driving agents.
The proposed algorithm balances diversity and driving skills by leveraging the representation and exploration abilities of deep reinforcement learning.
We experimentally show the effectiveness of our methods on several challenging intersection scenes.
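One common way to realize such a diversity/skill balance is a novelty bonus added to the task reward; the sketch below is that generic recipe, not necessarily the paper's exact objective:

```python
# Generic novelty-bonus recipe for trading off behavioral diversity against
# driving skill; not necessarily the paper's exact objective.
import numpy as np

def shaped_reward(task_reward, behavior_embedding, population_embeddings, beta=0.1):
    # Novelty: mean distance between this policy's behavior descriptor and
    # the descriptors of policies generated so far; beta tunes the balance.
    novelty = np.mean([np.linalg.norm(behavior_embedding - e)
                       for e in population_embeddings])
    return task_reward + beta * novelty
```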
arXiv Detail & Related papers (2020-11-11T12:49:11Z)
- Behavior Priors for Efficient Reinforcement Learning [97.81587970962232]
We consider how information and architectural constraints can be combined with ideas from the probabilistic modeling literature to learn behavior priors.
We discuss how such latent variable formulations connect to related work on hierarchical reinforcement learning (HRL) and mutual information and curiosity based objectives.
We demonstrate the effectiveness of our framework by applying it to a range of simulated continuous control domains.
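A typical latent-variable behavior-prior formulation regularizes the task policy $\pi$ toward a learned prior $\pi_0$ with a KL term; the paper's exact objective may differ in its architectural and information constraints:

```latex
\max_{\pi}\; \mathbb{E}_{\pi}\Big[ \sum_t r(s_t, a_t)
  - \alpha\, D_{\mathrm{KL}}\big(\pi(\cdot \mid s_t)\,\|\,\pi_0(\cdot \mid s_t)\big) \Big]
```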
arXiv Detail & Related papers (2020-10-27T13:17:18Z)