Safer Autonomous Driving in a Stochastic, Partially-Observable
Environment by Hierarchical Contingency Planning
- URL: http://arxiv.org/abs/2204.06509v1
- Date: Wed, 13 Apr 2022 16:47:00 GMT
- Title: Safer Autonomous Driving in a Stochastic, Partially-Observable
Environment by Hierarchical Contingency Planning
- Authors: Ugo Lecerf, Christelle Yemdji-Tchassi, Pietro Michiardi
- Abstract summary: An intelligent agent should be prepared to anticipate a change in its belief of the environment state.
This is especially the case for autonomous vehicles (AVs) navigating real-world situations where safety is paramount.
We show that our approach results in robust, safe behaviour in a partially observable, stochastic environment, generalizing well over environment dynamics not seen during training.
- Score: 10.971411555103574
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: When learning to act in a stochastic, partially observable environment, an
intelligent agent should be prepared to anticipate a change in its belief of
the environment state, and be capable of adapting its actions on-the-fly to
changing conditions. As humans, we are able to form contingency plans when
learning a task, with the explicit aim of correcting errors in the initial
control; such plans prove useful if there is ever a sudden change in our
perception of the environment that requires immediate corrective action. This
is especially the case for autonomous vehicles (AVs) navigating real-world
situations where safety is paramount, and a strong ability to react to a
changing belief about the environment is truly needed.
In this paper we explore an end-to-end approach, from training to execution,
for learning robust contingency plans and combining them with a hierarchical
planner to obtain a robust agent policy in an autonomous navigation task where
other vehicles' behaviours are unknown, and the agent's belief about these
behaviours is subject to sudden, last-second change. We show that our approach
results in robust, safe behaviour in a partially observable, stochastic
environment, generalizing well over environment dynamics not seen during
training.
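To make the abstract's idea concrete, here is a minimal, hypothetical sketch of a hierarchical control loop of this general kind: a high-level planner maintains a belief over another vehicle's behaviour and switches from a nominal low-level plan to a pre-learned contingency plan when that belief suddenly shifts. All names (`nominal_policy`, `contingency_policy`, `belief_threshold`) are illustrative assumptions, not the paper's actual architecture or API.

```python
# Hypothetical low-level policies; in the paper's setting these would be
# learned controllers, here they are stubs that return an action label.
def nominal_policy(obs):       # optimistic plan: assume the other car yields
    return "proceed"

def contingency_policy(obs):   # fallback plan: assume the other car cuts in
    return "brake"

class HierarchicalPlanner:
    """High level: track a belief about the other vehicle's behaviour and
    delegate to whichever low-level plan matches that belief."""

    def __init__(self, belief_threshold=0.5):
        self.p_aggressive = 0.1            # prior belief: other car is aggressive
        self.belief_threshold = belief_threshold

    def update_belief(self, likelihood_aggressive):
        # Bayes update for a binary behaviour model, where the argument is
        # P(observation | aggressive) and its complement is P(obs | not).
        prior = self.p_aggressive
        num = likelihood_aggressive * prior
        self.p_aggressive = num / (num + (1 - likelihood_aggressive) * (1 - prior))

    def act(self, obs):
        # A last-second belief change triggers the switch to the contingency plan.
        if self.p_aggressive > self.belief_threshold:
            return contingency_policy(obs)
        return nominal_policy(obs)

planner = HierarchicalPlanner()
for step in range(5):
    evidence = 0.95 if step >= 3 else 0.5  # the other car suddenly looks aggressive
    planner.update_belief(evidence)
    print(step, planner.act(obs=None), round(planner.p_aggressive, 3))
```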
Related papers
- A Safe Exploration Strategy for Model-free Task Adaptation in Safety-constrained Grid Environments [2.5037136114892267]
In safety-constrained environments, utilizing unsupervised exploration or a non-optimal policy may lead the agent to undesirable states.
We introduce a new exploration framework for navigating grid environments that enables model-free agents to interact with the environment while adhering to safety constraints.
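As a generic illustration of safety-constrained exploration in a grid world (not necessarily this paper's framework), a model-free agent can be restricted to sample only among actions whose successor cell is known to be safe; the `GRID` layout and action set below are made up for the example.

```python
import random

GRID = [
    "S...",
    ".XX.",   # 'X' marks unsafe cells the agent must never enter
    "...G",
]
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}

def safe_actions(pos):
    """Return only the moves whose successor cell stays on the grid and
    avoids unsafe cells -- exploration is restricted to this set."""
    r, c = pos
    ok = []
    for name, (dr, dc) in ACTIONS.items():
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(GRID) and 0 <= nc < len(GRID[0]) and GRID[nr][nc] != "X":
            ok.append(name)
    return ok

pos = (0, 0)
for _ in range(10):                       # random but constraint-respecting walk
    move = random.choice(safe_actions(pos))
    dr, dc = ACTIONS[move]
    pos = (pos[0] + dr, pos[1] + dc)
print("final cell:", pos)
```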
arXiv Detail & Related papers (2024-08-02T04:09:30Z)
- HAZARD Challenge: Embodied Decision Making in Dynamically Changing Environments [93.94020724735199]
HAZARD consists of three unexpected disaster scenarios, including fire, flood, and wind.
This benchmark enables us to evaluate autonomous agents' decision-making capabilities across various pipelines.
arXiv Detail & Related papers (2024-01-23T18:59:43Z)
- Dichotomy of Control: Separating What You Can Control from What You Cannot [129.62135987416164]
We propose a future-conditioned supervised learning framework that separates mechanisms within a policy's control (actions) from those beyond a policy's control (environment stochasticity).
We show that DoC yields policies that are consistent with their conditioning inputs, ensuring that conditioning a learned policy on a desired high-return future outcome will correctly induce high-return behavior.
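For background, here is a toy sketch of the return-conditioned supervised learning setting that DoC builds on; this is the plain conditioned-policy baseline rather than DoC's latent-variable objective, and the data is synthetic.

```python
import numpy as np

# Toy dataset of (state, action, return-to-go) triples from logged episodes.
rng = np.random.default_rng(0)
states = rng.normal(size=(500, 4))
actions = rng.integers(0, 2, size=500)
returns = rng.normal(size=500)

# Return-conditioned policy: predict the action given (state, return-to-go).
# DoC's point: in stochastic environments a high conditioning return may
# reflect environment luck rather than good actions, so DoC additionally
# learns to condition only on what the policy actually controls.
X = np.concatenate([states, returns[:, None]], axis=1)

# Minimal logistic-regression "policy" trained by gradient descent.
w = np.zeros(X.shape[1])
for _ in range(200):
    p = 1 / (1 + np.exp(-X @ w))
    w -= 0.1 * X.T @ (p - actions) / len(X)

# At test time, condition on a desired high return and act greedily.
desired_return = 2.0
test_x = np.concatenate([states[0], [desired_return]])
print("action:", int(1 / (1 + np.exp(-test_x @ w)) > 0.5))
```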
arXiv Detail & Related papers (2022-10-24T17:49:56Z)
- Automatically Learning Fallback Strategies with Model-Free Reinforcement Learning in Safety-Critical Driving Scenarios [9.761912672523977]
We present a principled approach for a model-free Reinforcement Learning (RL) agent to capture multiple modes of behaviour in an environment.
We introduce an extra pseudo-reward term to the reward model, to encourage exploration to areas of state-space different from areas privileged by the optimal policy.
We show that we are able to learn useful policies that would otherwise have been missed during training and would have been unavailable when executing the control algorithm.
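A sketch of the pseudo-reward idea as summarized above: shape the reward so that exploration is pushed toward states the current optimal policy rarely visits. The count-based novelty term and the weight `beta` are illustrative choices, not the paper's exact formulation.

```python
from collections import Counter

visit_counts = Counter()      # state-visitation statistics of the optimal policy

def record_optimal_trajectory(states):
    visit_counts.update(states)

def shaped_reward(env_reward, state, beta=0.5):
    """Add a pseudo-reward that grows in states the optimal policy rarely
    visits, steering exploration toward alternative behaviour modes."""
    novelty = 1.0 / (1.0 + visit_counts[state])
    return env_reward + beta * novelty

record_optimal_trajectory(["s0", "s1", "s2", "s1"])
print(shaped_reward(1.0, "s1"))   # familiar state: small bonus
print(shaped_reward(1.0, "s9"))   # unvisited state: full bonus
```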
arXiv Detail & Related papers (2022-04-11T15:34:49Z)
- Information is Power: Intrinsic Control via Information Capture [110.3143711650806]
We argue that a compact and general learning objective is to minimize the entropy of the agent's state visitation estimated using a latent state-space model.
This objective induces an agent to both gather information about its environment, corresponding to reducing uncertainty, and to gain control over its environment, corresponding to reducing the unpredictability of future world states.
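One way to turn this objective into an intrinsic reward, sketched with a count-based density estimate standing in for the paper's latent state-space model: reward the agent with the estimated log-probability of the state it visits, so that predictable, frequently seen states score highest.

```python
import math
from collections import Counter

counts = Counter()
total = 0

def intrinsic_reward(state):
    """Reward log q(state): visiting predictable, frequently seen states
    lowers the entropy of the visitation distribution. A count-based
    density here stands in for a learned latent state-space model."""
    global total
    counts[state] += 1
    total += 1
    return math.log(counts[state] / total)

for s in ["calm", "calm", "storm", "calm"]:
    print(s, round(intrinsic_reward(s), 3))   # familiar states score higher
```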
arXiv Detail & Related papers (2021-12-07T18:50:42Z)
- Unsupervised Domain Adaptation with Dynamics-Aware Rewards in Reinforcement Learning [28.808933152885874]
Unsupervised reinforcement learning aims to acquire skills without prior goal representations.
The intuitive approach of training in another, interaction-rich environment disrupts the trained skills in the target environment because the dynamics differ.
We propose an unsupervised domain adaptation method to identify and acquire skills across dynamics.
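A loose illustration of what a dynamics-aware reward can look like in general (a generic construction, not necessarily this paper's estimator): a skill-discovery reward minus a penalty wherever source and target transition dynamics disagree, with the disagreement scored by hypothetical classifier outputs `p_source` and `p_target`.

```python
import math

def dynamics_aware_reward(skill_logprob, p_target, p_source, alpha=1.0):
    """skill_logprob: log q(skill | state), a standard skill-discovery reward.
    p_target / p_source: probabilities a (hypothetical) classifier assigns to
    the observed transition under target vs. source dynamics. The penalty
    discourages skills relying on transitions absent in the target domain."""
    dynamics_penalty = math.log(p_source + 1e-8) - math.log(p_target + 1e-8)
    return skill_logprob - alpha * max(dynamics_penalty, 0.0)

# A transition equally likely under both dynamics incurs no penalty:
print(dynamics_aware_reward(-0.2, p_target=0.5, p_source=0.5))
# One far likelier in the source than the target reduces the reward:
print(dynamics_aware_reward(-0.2, p_target=0.1, p_source=0.9))
```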
arXiv Detail & Related papers (2021-10-25T14:40:48Z)
- Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings [129.80279257258098]
Reinforcement learning (RL) in real-world safety-critical target settings like urban driving is hazardous.
We propose a "safety-critical adaptation" task setting: an agent first trains in non-safety-critical "source" environments.
We propose a solution approach, CARL, that builds on the intuition that prior experience in diverse environments equips an agent to estimate risk.
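A minimal sketch of that intuition (the ensemble and its catastrophe predictors are hypothetical stand-ins for models learned in the source environments): act so that even the most pessimistic ensemble member predicts low risk.

```python
# Hypothetical ensemble of catastrophe predictors, one per source environment.
# Each maps (state, action) -> estimated probability of a catastrophic outcome.
ensemble = [
    lambda s, a: 0.02 if a == "slow" else 0.10,
    lambda s, a: 0.01 if a == "slow" else 0.40,   # a pessimistic member
    lambda s, a: 0.03 if a == "slow" else 0.05,
]

def cautious_action(state, candidate_actions):
    """Pick the action whose worst-case (max over the ensemble) predicted
    risk is smallest -- pessimism substitutes for target-domain data."""
    def worst_case_risk(a):
        return max(model(state, a) for model in ensemble)
    return min(candidate_actions, key=worst_case_risk)

print(cautious_action(state=None, candidate_actions=["slow", "fast"]))  # -> slow
```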
arXiv Detail & Related papers (2020-08-15T01:40:59Z)
- Ecological Reinforcement Learning [76.9893572776141]
We study the kinds of environment properties that can make learning under such conditions easier.
Understanding how properties of the environment impact the performance of reinforcement learning agents can help us structure our tasks in ways that make learning tractable.
arXiv Detail & Related papers (2020-06-22T17:55:03Z)
- Safe Reinforcement Learning via Curriculum Induction [94.67835258431202]
In safety-critical applications, autonomous agents may need to learn in an environment where mistakes can be very costly.
Existing safe reinforcement learning methods make an agent rely on priors that let it avoid dangerous situations.
This paper presents an alternative approach inspired by human teaching, where an agent learns under the supervision of an automatic instructor.
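A toy sketch of such a teacher/student loop (the intervention set and the bandit-style selection rule are illustrative, not the paper's algorithm): the instructor repeatedly picks the protective intervention under which the student currently fails least, with occasional exploration.

```python
import random

interventions = ["soft_reset", "hard_reset", "no_intervention"]
failures = {i: 0 for i in interventions}
trials = {i: 1 for i in interventions}

def student_trains_under(intervention):
    # Stand-in for a training episode; returns True if the student failed.
    fail_prob = {"soft_reset": 0.1, "hard_reset": 0.05, "no_intervention": 0.4}
    return random.random() < fail_prob[intervention]

for episode in range(50):
    # Greedy teacher: mostly pick the intervention with the lowest observed
    # failure rate, sometimes explore a different one.
    if random.random() < 0.2:
        choice = random.choice(interventions)
    else:
        choice = min(interventions, key=lambda i: failures[i] / trials[i])
    trials[choice] += 1
    failures[choice] += student_trains_under(choice)

print({i: round(failures[i] / trials[i], 2) for i in interventions})
```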
arXiv Detail & Related papers (2020-06-22T10:48:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.