Related papers: Motion Planning for Autonomous Vehicles in the Presence of Uncertainty Using Reinforcement Learning

Motion Planning for Autonomous Vehicles in the Presence of Uncertainty Using Reinforcement Learning

URL: http://arxiv.org/abs/2110.00640v1
Date: Fri, 1 Oct 2021 20:32:25 GMT
Title: Motion Planning for Autonomous Vehicles in the Presence of Uncertainty Using Reinforcement Learning
Authors: Kasra Rezaee, Peyman Yadmellat, Simon Chamorro
Abstract summary: Motion planning under uncertainty is one of the main challenges in developing autonomous driving vehicles. We propose a reinforcement learning based solution to manage uncertainty by optimizing for the worst case outcome. The proposed approach yields much better motion planning behavior compared to conventional RL algorithms and behaves comparably to humans driving style.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Motion planning under uncertainty is one of the main challenges in developing autonomous driving vehicles. In this work, we focus on the uncertainty in sensing and perception, resulted from a limited field of view, occlusions, and sensing range. This problem is often tackled by considering hypothetical hidden objects in occluded areas or beyond the sensing range to guarantee passive safety. However, this may result in conservative planning and expensive computation, particularly when numerous hypothetical objects need to be considered. We propose a reinforcement learning (RL) based solution to manage uncertainty by optimizing for the worst case outcome. This approach is in contrast to traditional RL, where the agents try to maximize the average expected reward. The proposed approach is built on top of the Distributional RL with its policy optimization maximizing the stochastic outcomes' lower bound. This modification can be applied to a range of RL algorithms. As a proof-of-concept, the approach is applied to two different RL algorithms, Soft Actor-Critic and DQN. The approach is evaluated against two challenging scenarios of pedestrians crossing with occlusion and curved roads with a limited field of view. The algorithm is trained and evaluated using the SUMO traffic simulator. The proposed approach yields much better motion planning behavior compared to conventional RL algorithms and behaves comparably to humans driving style.

Related papers

Evaluating Scenario-based Decision-making for Interactive Autonomous Driving Using Rational Criteria: A Survey [14.51227749657833]
This survey reviews the application of deep reinforcement learning (DRL) algorithms in autonomous driving across typical scenarios. The scenarios include highways, on-ramp merging, roundabouts, and unsignalized intersections. DRL-based algorithms are evaluated based on five rationale criteria: driving safety, driving efficiency, training efficiency, unselfishness, and interpretability.
arXiv Detail & Related papers (2025-01-03T16:37:52Z)
One-Shot Safety Alignment for Large Language Models via Optimal Dualization [64.52223677468861]
This paper presents a perspective of dualization that reduces constrained alignment to an equivalent unconstrained alignment problem. We do so by pre-optimizing a smooth and convex dual function that has a closed form. Our strategy leads to two practical algorithms in model-based and preference-based settings.
arXiv Detail & Related papers (2024-05-29T22:12:52Z)
REBEL: Reward Regularization-Based Approach for Robotic Reinforcement Learning from Human Feedback [61.54791065013767]
A misalignment between the reward function and human preferences can lead to catastrophic outcomes in the real world. Recent methods aim to mitigate misalignment by learning reward functions from human preferences. We propose a novel concept of reward regularization within the robotic RLHF framework.
arXiv Detail & Related papers (2023-12-22T04:56:37Z)
Integrating Higher-Order Dynamics and Roadway-Compliance into Constrained ILQR-based Trajectory Planning for Autonomous Vehicles [3.200238632208686]
Trajectory planning aims to produce a globally optimal route for Autonomous Passenger Vehicles. Existing implementations utilizing the vehicle bicycle kinematic model may not guarantee controllable trajectories. We augment this model by higher-order terms, including the first and second-order derivatives of curvature and longitudinal jerk.
arXiv Detail & Related papers (2023-09-25T22:30:18Z)
Action and Trajectory Planning for Urban Autonomous Driving with Hierarchical Reinforcement Learning [1.3397650653650457]
We propose an action and trajectory planner using Hierarchical Reinforcement Learning (atHRL) method. We empirically verify the efficacy of atHRL through extensive experiments in complex urban driving scenarios.
arXiv Detail & Related papers (2023-06-28T07:11:02Z)
A Multiplicative Value Function for Safe and Efficient Reinforcement Learning [131.96501469927733]
We propose a safe model-free RL algorithm with a novel multiplicative value function consisting of a safety critic and a reward critic. The safety critic predicts the probability of constraint violation and discounts the reward critic that only estimates constraint-free returns. We evaluate our method in four safety-focused environments, including classical RL benchmarks augmented with safety constraints and robot navigation tasks with images and raw Lidar scans as observations.
arXiv Detail & Related papers (2023-03-07T18:29:15Z)
Offline Policy Optimization in RL with Variance Regularizaton [142.87345258222942]
We propose variance regularization for offline RL algorithms, using stationary distribution corrections. We show that by using Fenchel duality, we can avoid double sampling issues for computing the gradient of the variance regularizer. The proposed algorithm for offline variance regularization (OVAR) can be used to augment any existing offline policy optimization algorithms.
arXiv Detail & Related papers (2022-12-29T18:25:01Z)
Learning-based Preference Prediction for Constrained Multi-Criteria Path-Planning [12.457788665461312]
Constrained path-planning for Autonomous Ground Vehicles (AGV) is one such application. We leverage knowledge acquired through offline simulations by training a neural network model to predict the uncertain criterion. We integrate this model inside a path-planner which can solve problems online.
arXiv Detail & Related papers (2021-08-02T17:13:45Z)
Behavior Planning at Urban Intersections through Hierarchical Reinforcement Learning [25.50973559614565]
In this work, we propose a behavior planning structure based on reinforcement learning (RL) which is capable of performing autonomous vehicle behavior planning with a hierarchical structure in simulated urban environments. Our algorithms can perform better than rule-based methods for elective decisions such as when to turn left between vehicles approaching from the opposite direction or possible lane-change when approaching an intersection due to lane blockage or delay in front of the ego car. Results also show that the proposed method converges to an optimal policy faster than traditional RL methods.
arXiv Detail & Related papers (2020-11-09T19:23:26Z)
Decision-making for Autonomous Vehicles on Highway: Deep Reinforcement Learning with Continuous Action Horizon [14.059728921828938]
This paper utilizes the deep reinforcement learning (DRL) method to address the continuous-horizon decision-making problem on the highway. The running objective of the ego automated vehicle is to execute an efficient and smooth policy without collision. The PPO-DRL-based decision-making strategy is estimated from multiple perspectives, including the optimality, learning efficiency, and adaptability.
arXiv Detail & Related papers (2020-08-26T22:49:27Z)
Reinforcement Learning for Low-Thrust Trajectory Design of Interplanetary Missions [77.34726150561087]
This paper investigates the use of reinforcement learning for the robust design of interplanetary trajectories in presence of severe disturbances. An open-source implementation of the state-of-the-art algorithm Proximal Policy Optimization is adopted. The resulting Guidance and Control Network provides both a robust nominal trajectory and the associated closed-loop guidance law.
arXiv Detail & Related papers (2020-08-19T15:22:15Z)
The Importance of Prior Knowledge in Precise Multimodal Prediction [71.74884391209955]
Roads have well defined geometries, topologies, and traffic rules. In this paper we propose to incorporate structured priors as a loss function. We demonstrate the effectiveness of our approach on real-world self-driving datasets.
arXiv Detail & Related papers (2020-06-04T03:56:11Z)
Guided Constrained Policy Optimization for Dynamic Quadrupedal Robot Locomotion [78.46388769788405]
We introduce guided constrained policy optimization (GCPO), an RL framework based upon our implementation of constrained policy optimization (CPPO) We show that guided constrained RL offers faster convergence close to the desired optimum resulting in an optimal, yet physically feasible, robotic control behavior without the need for precise reward function tuning.
arXiv Detail & Related papers (2020-02-22T10:15:53Z)

This list is automatically generated from the titles and abstracts of the papers in this site.