Planning on the fast lane: Learning to interact using attention
mechanisms in path integral inverse reinforcement learning
- URL: http://arxiv.org/abs/2007.05798v2
- Date: Sat, 12 Sep 2020 08:33:02 GMT
- Title: Planning on the fast lane: Learning to interact using attention
mechanisms in path integral inverse reinforcement learning
- Authors: Sascha Rosbach, Xing Li, Simon Großjohann, Silviu Homoceanu and
Stefan Roth
- Abstract summary: General-purpose trajectory planning algorithms for automated driving utilize complex reward functions.
Deep learning approaches have been successfully applied to predict local situation-dependent reward functions.
We present a neural network architecture that uses a policy attention mechanism to generate a low-dimensional context vector.
- Score: 20.435909887810165
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: General-purpose trajectory planning algorithms for automated driving utilize
complex reward functions to perform a combined optimization of strategic,
behavioral, and kinematic features. The specification and tuning of a single
reward function is a tedious task and does not generalize over a large set of
traffic situations. Deep learning approaches based on path integral inverse
reinforcement learning have been successfully applied to predict local
situation-dependent reward functions using features of a set of sampled driving
policies. Sample-based trajectory planning algorithms are able to approximate a
spatio-temporal subspace of feasible driving policies that can be used to
encode the context of a situation. However, the interaction with dynamic
objects requires an extended planning horizon, which depends on sequential
context modeling. In this work, we are concerned with the sequential reward
prediction over an extended time horizon. We present a neural network
architecture that uses a policy attention mechanism to generate a
low-dimensional context vector by concentrating on trajectories with a
human-like driving style. Apart from this, we propose a temporal attention
mechanism to identify context switches and allow for stable adaptation of
rewards. We evaluate our results on complex simulated driving situations,
including other moving vehicles. Our evaluation shows that our policy attention
mechanism learns to focus on collision-free policies in the configuration
space. Furthermore, the temporal attention mechanism learns persistent
interaction with other vehicles over an extended planning horizon.
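The paper's own network and training details are not given here, but the core idea of the policy attention mechanism (scoring a set of sampled driving policies and aggregating their features into a low-dimensional context vector) can be sketched generically. The query vector, feature layout, and dimensions below are illustrative assumptions, not the authors' architecture:

```python
import numpy as np

def softmax(x):
    # numerically stable softmax
    e = np.exp(x - np.max(x))
    return e / e.sum()

def policy_attention(policy_features, query):
    """Illustrative policy attention over sampled driving policies.

    Each sampled policy is described by a feature vector; a learned
    query scores the policies, and the context vector is the
    attention-weighted sum of their features.

    policy_features: (n_policies, d) array of per-policy features
    query:           (d,) hypothetical learned query vector
    returns:         (context of shape (d,), weights of shape (n_policies,))
    """
    scores = policy_features @ query      # one score per sampled policy
    weights = softmax(scores)             # attention distribution over policies
    context = weights @ policy_features   # low-dimensional context vector
    return context, weights

# Toy example: 4 sampled policies with 3 features each
# (e.g. progress, lateral offset, proximity to obstacles).
feats = np.array([[1.0, 0.0, 0.2],
                  [0.9, 0.1, 0.1],
                  [0.0, 1.0, 0.8],
                  [0.1, 0.9, 0.9]])
q = np.array([1.0, -1.0, 0.0])  # hypothetical learned query
ctx, w = policy_attention(feats, q)
```

In the paper's setting, such a query would be learned end-to-end so that the attention concentrates on collision-free, human-like policies; here it is fixed only to keep the sketch self-contained.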
Related papers
- End-to-end Driving in High-Interaction Traffic Scenarios with Reinforcement Learning [24.578178308010912]
We propose an end-to-end model-based RL algorithm named Ramble to address these issues.
By learning a dynamics model of the environment, Ramble can foresee upcoming traffic events and make more informed, strategic decisions.
Ramble achieves state-of-the-art performance regarding route completion rate and driving score on the CARLA Leaderboard 2.0, showcasing its effectiveness in managing complex and dynamic traffic situations.
arXiv Detail & Related papers (2024-10-03T06:45:59Z) - Interactive Autonomous Navigation with Internal State Inference and
Interactivity Estimation [58.21683603243387]
We propose three auxiliary tasks with relational-temporal reasoning and integrate them into the standard Deep Learning framework.
These auxiliary tasks provide additional supervision signals to infer the behavior patterns of other interactive agents.
Our approach achieves robust and state-of-the-art performance in terms of standard evaluation metrics.
arXiv Detail & Related papers (2023-11-27T18:57:42Z) - PPAD: Iterative Interactions of Prediction and Planning for End-to-end Autonomous Driving [57.89801036693292]
PPAD (Iterative Interaction of Prediction and Planning Autonomous Driving) considers the timestep-wise interaction to better integrate prediction and planning.
We design ego-to-agent, ego-to-map, and ego-to-BEV interaction mechanisms with hierarchical dynamic key objects attention to better model the interactions.
arXiv Detail & Related papers (2023-11-14T11:53:24Z) - Integration of Reinforcement Learning Based Behavior Planning With
Sampling Based Motion Planning for Automated Driving [0.5801044612920815]
We propose a method to employ a trained deep reinforcement learning policy for dedicated high-level behavior planning.
To the best of our knowledge, this work is the first to apply deep reinforcement learning in this manner.
arXiv Detail & Related papers (2023-04-17T13:49:55Z) - GINK: Graph-based Interaction-aware Kinodynamic Planning via
Reinforcement Learning for Autonomous Driving [10.782043595405831]
There are many challenges in applying deep reinforcement learning (DRL) to autonomous driving in a structured environment such as an urban area.
In this paper, we suggest a new framework that effectively combines graph-based intention representation and reinforcement learning for dynamic planning.
Experiments show the state-of-the-art performance of our approach compared to existing baselines.
arXiv Detail & Related papers (2022-06-03T10:37:25Z) - Deep Interactive Motion Prediction and Planning: Playing Games with
Motion Prediction Models [162.21629604674388]
This work presents a game-theoretic Model Predictive Controller (MPC) that uses a novel interactive multi-agent neural network policy as part of its predictive model.
Fundamental to the success of our method is the design of a novel multi-agent policy network that can steer a vehicle given the state of the surrounding agents and the map information.
arXiv Detail & Related papers (2022-04-05T17:58:18Z) - Generating Useful Accident-Prone Driving Scenarios via a Learned Traffic
Prior [135.78858513845233]
STRIVE is a method to automatically generate challenging scenarios that cause a given planner to produce undesirable behavior, like collisions.
To maintain scenario plausibility, the key idea is to leverage a learned model of traffic motion in the form of a graph-based conditional VAE.
A subsequent optimization is used to find a "solution" to the scenario, ensuring it is useful to improve the given planner.
arXiv Detail & Related papers (2021-12-09T18:03:27Z) - Real-world Ride-hailing Vehicle Repositioning using Deep Reinforcement
Learning [52.2663102239029]
We present a new practical framework based on deep reinforcement learning and decision-time planning for real-world vehicle on idle-hailing platforms.
Our approach learns a ride-based state-value function using a batch training algorithm with deep value networks.
We benchmark our algorithm with baselines in a ride-hailing simulation environment to demonstrate its superiority in improving income efficiency.
arXiv Detail & Related papers (2021-03-08T05:34:05Z) - An End-to-end Deep Reinforcement Learning Approach for the Long-term
Short-term Planning on the Frenet Space [0.0]
This paper presents a novel end-to-end continuous deep reinforcement learning approach towards autonomous cars' decision-making and motion planning.
For the first time, we define both states and action spaces on the Frenet space to make the driving behavior less variant to the road curvatures.
The algorithm generates continuous spatio-temporal trajectories on the Frenet frame for the feedback controller to track.
arXiv Detail & Related papers (2020-11-26T02:40:07Z) - Trajectory Planning for Autonomous Vehicles Using Hierarchical
Reinforcement Learning [21.500697097095408]
Planning safe trajectories under uncertain and dynamic conditions makes the autonomous driving problem significantly complex.
Current sampling-based methods such as Rapidly Exploring Random Trees (RRTs) are not ideal for this problem because of the high computational cost.
We propose a Hierarchical Reinforcement Learning structure combined with a Proportional-Integral-Derivative (PID) controller for trajectory planning.
arXiv Detail & Related papers (2020-11-09T20:49:54Z) - Intelligent Roundabout Insertion using Deep Reinforcement Learning [68.8204255655161]
We present a maneuver planning module able to negotiate the entering in busy roundabouts.
The proposed module is based on a neural network trained to predict when and how to enter the roundabout throughout the whole duration of the maneuver.
arXiv Detail & Related papers (2020-01-03T11:16:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences.