Weakly Supervised Reinforcement Learning for Autonomous Highway Driving
via Virtual Safety Cages
- URL: http://arxiv.org/abs/2103.09726v1
- Date: Wed, 17 Mar 2021 15:30:36 GMT
- Title: Weakly Supervised Reinforcement Learning for Autonomous Highway Driving
via Virtual Safety Cages
- Authors: Sampo Kuutti, Richard Bowden, Saber Fallah
- Abstract summary: We present a reinforcement learning based approach to autonomous vehicle longitudinal control, where the rule-based safety cages provide enhanced safety for the vehicle as well as weak supervision to the reinforcement learning agent.
We show that when the model parameters are constrained or sub-optimal, the safety cages can enable a model to learn a safe driving policy even when the model could not be trained to drive through reinforcement learning alone.
- Score: 42.57240271305088
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The use of neural networks and reinforcement learning has become increasingly
popular in autonomous vehicle control. However, the opaqueness of the resulting
control policies presents a significant barrier to deploying neural
network-based control in autonomous vehicles. In this paper, we present a
reinforcement learning based approach to autonomous vehicle longitudinal
control, where the rule-based safety cages provide enhanced safety for the
vehicle as well as weak supervision to the reinforcement learning agent. By
guiding the agent to meaningful states and actions, this weak supervision
improves the convergence during training and enhances the safety of the final
trained policy. This rule-based supervisory controller has the further
advantage of being fully interpretable, thereby enabling traditional validation
and verification approaches to ensure the safety of the vehicle. We compare
models with and without safety cages, as well as models with optimal and
constrained model parameters, and show that the weak supervision consistently
improves the safety of exploration, speed of convergence, and model
performance. Additionally, we show that when the model parameters are
constrained or sub-optimal, the safety cages can enable a model to learn a safe
driving policy even when the model could not be trained to drive through
reinforcement learning alone.
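As a rough illustration of the idea in the abstract, the sketch below shows how a rule-based safety cage could both override an unsafe longitudinal command and feed a penalty back into the reward as weak supervision. It is not the authors' implementation: the function names, the use of time headway and time-to-collision as the cage rules, and all threshold and penalty values are assumptions made for this example.

```python
from dataclasses import dataclass

# Hypothetical values; the abstract gives no numeric thresholds.
MIN_TIME_HEADWAY_S = 1.0  # assumed headway limit below which the cage brakes
MIN_TTC_S = 2.0           # assumed time-to-collision limit
CAGE_BRAKE = -1.0         # assumed normalized full-brake command
CAGE_PENALTY = 0.1        # assumed reward penalty per intervention


@dataclass
class CageResult:
    action: float     # command actually sent to the vehicle
    intervened: bool  # True if the cage overrode the agent
    penalty: float    # amount subtracted from the reward during training


def safety_cage(agent_action: float, gap_m: float,
                ego_speed: float, lead_speed: float) -> CageResult:
    """Apply interpretable rules to a longitudinal command in [-1, 1]."""
    # Time headway: seconds for the ego vehicle to cover the current gap.
    headway = gap_m / ego_speed if ego_speed > 0 else float("inf")
    # Time to collision: only finite while the ego vehicle is closing in.
    closing_speed = ego_speed - lead_speed
    ttc = gap_m / closing_speed if closing_speed > 0 else float("inf")

    unsafe = headway < MIN_TIME_HEADWAY_S or ttc < MIN_TTC_S
    if unsafe and agent_action > CAGE_BRAKE:
        # Override the agent and flag the intervention for the learner.
        return CageResult(CAGE_BRAKE, True, CAGE_PENALTY)
    return CageResult(agent_action, False, 0.0)


def shaped_reward(env_reward: float, result: CageResult) -> float:
    """Weak supervision: penalize steps where the cage had to step in."""
    return env_reward - result.penalty
```

In a training loop under these assumptions, the agent's proposed action would pass through `safety_cage` before reaching the simulator, and `shaped_reward` would replace the raw environment reward, so the agent is steered away from states where the cage intervenes.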
Related papers
- Empowering Autonomous Driving with Large Language Models: A Safety Perspective [82.90376711290808]
This paper explores the integration of Large Language Models (LLMs) into Autonomous Driving systems.
LLMs are intelligent decision-makers in behavioral planning, augmented with a safety verifier shield for contextual safety learning.
We present two key studies in a simulated environment: an adaptive LLM-conditioned Model Predictive Control (MPC) and an LLM-enabled interactive behavior planning scheme with a state machine.
(2023-11-28T03:13:09Z)
- Evaluation of Safety Constraints in Autonomous Navigation with Deep Reinforcement Learning [62.997667081978825]
We compare two learnable navigation policies: safe and unsafe.
The safe policy takes the constraints into account, while the other does not.
We show that the safe policy generates trajectories with more clearance (distance to the obstacles) and causes fewer collisions during training, without sacrificing overall performance.
(2023-07-27T01:04:57Z)
- ConBaT: Control Barrier Transformer for Safe Policy Learning [26.023275758215423]
Control Barrier Transformer (ConBaT) is an approach that learns safe behaviors from demonstrations in a self-supervised fashion.
During deployment, we employ a lightweight online optimization to find actions that ensure future states lie within the learned safe set.
(2023-03-07T20:04:28Z)
- ISAACS: Iterative Soft Adversarial Actor-Critic for Safety [0.9217021281095907]
This work introduces a novel approach enabling scalable synthesis of robust safety-preserving controllers for robotic systems.
A safety-seeking fallback policy is co-trained with an adversarial "disturbance" agent that aims to invoke the worst-case realization of model error.
While the learned control policy does not intrinsically guarantee safety, it is used to construct a real-time safety filter.
(2022-12-06T18:53:34Z)
- Differentiable Control Barrier Functions for Vision-based End-to-End Autonomous Driving [100.57791628642624]
We introduce a safety guaranteed learning framework for vision-based end-to-end autonomous driving.
We design a learning system equipped with differentiable control barrier functions (dCBFs) that is trained end-to-end by gradient descent.
(2022-03-04T16:14:33Z)
- Model-Reference Reinforcement Learning for Collision-Free Tracking Control of Autonomous Surface Vehicles [1.7033108359337459]
The proposed control algorithm combines a conventional control method with reinforcement learning to enhance control accuracy and intelligence.
Thanks to reinforcement learning, the overall tracking controller is capable of compensating for model uncertainties and achieving collision avoidance.
(2020-08-17T12:15:15Z)
- Safe Reinforcement Learning via Curriculum Induction [94.67835258431202]
In safety-critical applications, autonomous agents may need to learn in an environment where mistakes can be very costly.
Existing safe reinforcement learning methods make an agent rely on priors that let it avoid dangerous situations.
This paper presents an alternative approach inspired by human teaching, where an agent learns under the supervision of an automatic instructor.
(2020-06-22T10:48:17Z)
- Training Adversarial Agents to Exploit Weaknesses in Deep Control Policies [47.08581439933752]
We propose an automated black box testing framework based on adversarial reinforcement learning.
We show that the proposed framework is able to find weaknesses in both control policies that were not evident during online testing.
(2020-02-27T13:14:53Z)