An Online Data-Driven Emergency-Response Method for Autonomous Agents in
Unforeseen Situations
- URL: http://arxiv.org/abs/2112.09670v1
- Date: Fri, 17 Dec 2021 18:31:37 GMT
- Title: An Online Data-Driven Emergency-Response Method for Autonomous Agents in
Unforeseen Situations
- Authors: Glenn Maguire, Nicholas Ketz, Praveen Pilly, Jean-Baptiste Mouret
- Abstract summary: This paper presents an online, data-driven, emergency-response method.
It aims to provide autonomous agents the ability to react to unexpected situations.
We demonstrate the potential of this approach in a simulated 3D car driving scenario.
- Score: 4.339510167603376
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning agents perform well when presented with inputs within
the distribution of those encountered during training. However, they are unable
to respond effectively when faced with novel, out-of-distribution events, until
they have undergone additional training. This paper presents an online,
data-driven, emergency-response method that aims to provide autonomous agents
with the ability to react to unexpected situations that are very different from
those they have been trained or designed to address. In such situations, learned
policies cannot be expected to perform appropriately since the observations
obtained in these novel situations would fall outside the distribution of
inputs that the agent has been optimized to handle. The proposed approach
devises a customized response to the unforeseen situation sequentially, by
selecting actions that minimize the rate of increase of the reconstruction
error from a variational auto-encoder. This optimization is achieved online in
a data-efficient manner (on the order of 30 data-points) using a modified
Bayesian optimization procedure. We demonstrate the potential of this approach
in a simulated 3D car driving scenario, in which the agent devises a response
in under 2 seconds to avoid collisions with objects it has not seen during
training.
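The abstract describes the response mechanism only in words. A minimal illustration of that loop in Python, assuming a pretrained VAE exposed as a recon_error(observation) function, a low-dimensional continuous action space, a classic Gym-style environment, and a generic Gaussian-process surrogate with a lower-confidence-bound pick standing in for the authors' modified Bayesian optimization procedure, might look like this:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def emergency_response(env, obs, recon_error, n_steps=30, n_candidates=256,
                       action_dim=2, kappa=2.0):
    """Pick actions online so that the VAE reconstruction error grows as slowly
    as possible (illustrative sketch, not the authors' exact procedure)."""
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    X, y = [], []                        # actions tried / observed error-growth rates
    prev_err = recon_error(obs)
    for _ in range(n_steps):             # budget on the order of 30 data points
        cand = np.random.uniform(-1.0, 1.0, size=(n_candidates, action_dim))
        if X:
            gp.fit(np.array(X), np.array(y))
            mu, sigma = gp.predict(cand, return_std=True)
            action = cand[np.argmin(mu - kappa * sigma)]   # optimistic pick for minimisation
        else:
            action = cand[0]             # no data yet: try an arbitrary candidate
        obs, _, done, _ = env.step(action)   # classic Gym 4-tuple API assumed
        err = recon_error(obs)
        X.append(action)
        y.append(err - prev_err)         # per-step rate of increase of the error
        prev_err = err
        if done:
            break
    return obs
```

The key quantity is the per-step change in reconstruction error: a novel obstacle drives the error up, and actions that slow that growth tend to steer the agent back toward observations it was trained on.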
Related papers
- Representation-based Reward Modeling for Efficient Safety Alignment of Large Language Model [84.00480999255628]
Reinforcement Learning algorithms for safety alignment of Large Language Models (LLMs) encounter the challenge of distribution shift.
Current approaches typically address this issue through online sampling from the target policy.
We propose a new framework that leverages the model's intrinsic safety judgment capability to extract reward signals.
arXiv Detail & Related papers (2025-03-13T06:40:34Z)
Re-thinking Data Availablity Attacks Against Deep Neural Networks [53.64624167867274]
In this paper, we re-examine the concept of unlearnable examples and discern that the existing robust error-minimizing noise presents an inaccurate optimization objective.
We introduce a novel optimization paradigm that yields improved protection results with reduced computational time requirements.
arXiv Detail & Related papers (2023-05-18T04:03:51Z)
Let Offline RL Flow: Training Conservative Agents in the Latent Space of Normalizing Flows [58.762959061522736]
Offline reinforcement learning aims to train a policy on a pre-recorded and fixed dataset without any additional environment interactions.
We build upon recent works on learning policies in latent action spaces and use a special form of Normalizing Flows for constructing a generative model (a simplified latent-action sketch follows this entry).
We evaluate our method on various locomotion and navigation tasks, demonstrating that our approach outperforms recently proposed algorithms.
arXiv Detail & Related papers (2022-11-20T21:57:10Z)
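The summary above does not specify the flow architecture; the sketch below illustrates the general latent-action idea with a plain RealNVP-style affine coupling stack. The decoder, its lack of state conditioning, and the layer count are all assumptions, not the paper's "special form" of flow.

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """One RealNVP-style coupling layer: half of the vector is transformed
    with a scale and shift predicted from the other half."""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.half = dim // 2
        self.net = nn.Sequential(
            nn.Linear(self.half, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (dim - self.half)),
        )

    def forward(self, z):                       # latent -> action space
        z1, z2 = z[:, :self.half], z[:, self.half:]
        log_s, t = self.net(z1).chunk(2, dim=-1)
        return torch.cat([z1, z2 * log_s.exp() + t], dim=-1)

class LatentActionDecoder(nn.Module):
    """Stack of couplings mapping latent codes to dataset-like actions."""
    def __init__(self, action_dim, n_layers=4):
        super().__init__()
        self.layers = nn.ModuleList(AffineCoupling(action_dim) for _ in range(n_layers))

    def forward(self, z):
        for layer in self.layers:
            z = layer(z)
            z = z.flip(dims=[-1])               # crude stand-in for a permutation layer
        return z

# The offline policy outputs a bounded latent code z, and the environment
# receives decoder(z); the decoder is trained beforehand as a generative
# model of the actions in the offline dataset.
decoder = LatentActionDecoder(action_dim=6)
z = torch.tanh(torch.randn(1, 6))               # stand-in for a policy output
action = decoder(z)
```

Restricting the policy to latent codes decoded through a generative model of dataset actions is what keeps the agent conservative: it can only emit actions the flow assigns probability mass to.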
Adaptive Behavior Cloning Regularization for Stable Offline-to-Online Reinforcement Learning [80.25648265273155]
Offline reinforcement learning, by learning from a fixed dataset, makes it possible to learn agent behaviors without interacting with the environment.
During online fine-tuning, the performance of the pre-trained agent may collapse quickly due to the sudden distribution shift from offline to online data.
We propose to adaptively weigh the behavior cloning loss during online fine-tuning based on the agent's performance and training stability (a toy weighting schedule is sketched after this entry).
Experiments show that the proposed method yields state-of-the-art offline-to-online reinforcement learning performance on the popular D4RL benchmark.
arXiv Detail & Related papers (2022-10-25T09:08:26Z)
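The exact weighting rule is not given in the summary above, so the schedule below is only a plausible reading: shrink the behavior-cloning coefficient while online returns improve and stay stable, and grow it back when performance degrades.

```python
import numpy as np

class AdaptiveBCWeight:
    """Illustrative schedule for the behavior-cloning coefficient during
    online fine-tuning (not the paper's exact rule)."""
    def __init__(self, init_weight=1.0, lr=0.05, window=10):
        self.weight = init_weight
        self.lr = lr
        self.window = window
        self.returns = []

    def update(self, episode_return):
        self.returns.append(episode_return)
        recent = self.returns[-self.window:]
        if len(recent) < 2:
            return self.weight
        trend = recent[-1] - np.mean(recent[:-1])       # improving or collapsing?
        stability = np.std(recent)                      # high variance -> stay conservative
        if trend > 0 and stability < abs(np.mean(recent)):
            self.weight = max(0.0, self.weight - self.lr)   # trust the RL objective more
        else:
            self.weight = min(1.0, self.weight + self.lr)   # lean on behavior cloning
        return self.weight

# Usage: total_loss = rl_loss + bc_schedule.update(last_return) * bc_loss
```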
Dynamic Memory for Interpretable Sequential Optimisation [0.0]
We present a solution for handling non-stationarity that is suitable for deployment at scale.
We develop an adaptive Bayesian learning agent that employs a novel form of dynamic memory.
We describe the architecture of a large-scale deployment of automated sequential optimisation as a service.
arXiv Detail & Related papers (2022-06-28T12:29:13Z)
Reinforcement Learning in the Wild: Scalable RL Dispatching Algorithm Deployed in Ridehailing Marketplace [12.298997392937876]
This study proposes a real-time dispatching algorithm based on reinforcement learning.
It is deployed online in multiple cities under DiDi's operation for A/B testing and is launched in one of the major international markets.
The deployed algorithm shows over 1.3% improvement in total driver income from A/B testing.
arXiv Detail & Related papers (2022-02-10T16:07:17Z)
Lifelong Unsupervised Domain Adaptive Person Re-identification with Coordinated Anti-forgetting and Adaptation [127.6168183074427]
We propose a new task, Lifelong Unsupervised Domain Adaptive (LUDA) person ReID.
This is challenging because it requires the model to continuously adapt to unlabeled data of the target environments.
We design an effective scheme for this task, dubbed CLUDA-ReID, where the anti-forgetting is harmoniously coordinated with the adaptation.
arXiv Detail & Related papers (2021-12-13T13:19:45Z)
UMBRELLA: Uncertainty-Aware Model-Based Offline Reinforcement Learning Leveraging Planning [1.1339580074756188]
Offline reinforcement learning (RL) provides a framework for learning decision-making from offline data.
Self-driving vehicles (SDVs) can learn a policy that potentially even outperforms the behavior present in the sub-optimal dataset.
This motivates the use of model-based offline RL approaches, which leverage planning.
arXiv Detail & Related papers (2021-11-22T10:37:52Z)
Offline-to-Online Reinforcement Learning via Balanced Replay and Pessimistic Q-Ensemble [135.6115462399788]
Deep offline reinforcement learning has made it possible to train strong robotic agents from offline datasets.
State-action distribution shift may lead to severe bootstrap error during fine-tuning.
We propose a balanced replay scheme that prioritizes samples encountered online while also encouraging the use of near-on-policy samples (a simplified version is sketched after this entry).
arXiv Detail & Related papers (2021-07-01T16:26:54Z)
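As a rough illustration of the balanced replay idea summarized above: keep offline and online transitions in separate pools and bias minibatch sampling toward the online, near-on-policy pool. The actual scheme is priority-based rather than a fixed ratio, so the fraction below is an assumption.

```python
import random

class BalancedReplay:
    """Toy balanced replay: sample a fixed fraction of each minibatch from
    transitions collected online, the rest from the offline dataset."""
    def __init__(self, offline_data, online_fraction=0.75):
        self.offline = list(offline_data)
        self.online = []
        self.online_fraction = online_fraction

    def add_online(self, transition):
        self.online.append(transition)

    def sample(self, batch_size):
        # Draw as many online samples as the fraction allows, fill up from offline.
        n_online = min(int(batch_size * self.online_fraction), len(self.online))
        batch = random.sample(self.online, n_online) if n_online else []
        batch += random.sample(self.offline, batch_size - n_online)
        return batch
```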
Can Autonomous Vehicles Identify, Recover From, and Adapt to Distribution Shifts? [104.04999499189402]
Out-of-training-distribution (OOD) scenarios are a common challenge of learning agents at deployment.
We propose an uncertainty-aware planning method, called robust imitative planning (RIP), sketched in simplified form after this entry.
Our method can detect and recover from some distribution shifts, reducing the overconfident and catastrophic extrapolations in OOD scenes.
We introduce an autonomous car novel-scene benchmark, CARNOVEL, to evaluate the robustness of driving agents to a suite of tasks with distribution shifts.
arXiv Detail & Related papers (2020-06-26T11:07:32Z)
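The summary names robust imitative planning but not its selection rule; one common reading is a worst-case-over-ensemble criterion, sketched below under the assumption that each ensemble member can score a candidate plan (for example, by log-likelihood under an imitation model).

```python
import numpy as np

def robust_plan(candidate_plans, ensemble_scores):
    """Pick the plan whose worst-case score across an ensemble of imitation
    models is highest -- one reading of a min-max planning rule.

    candidate_plans : list of candidate trajectories / action sequences
    ensemble_scores : callable(plan) -> array of per-model scores
    """
    worst_case = [np.min(ensemble_scores(plan)) for plan in candidate_plans]
    return candidate_plans[int(np.argmax(worst_case))]
```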
Tactical Decision-Making in Autonomous Driving by Reinforcement Learning with Uncertainty Estimation [0.9883261192383611]
Reinforcement learning can be used to create a tactical decision-making agent for autonomous driving.
This paper investigates how a Bayesian RL technique can be used to estimate the uncertainty of decisions in autonomous driving (an ensemble-based uncertainty check is sketched after this entry).
arXiv Detail & Related papers (2020-04-22T08:22:28Z)
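The summary does not spell out how the decision uncertainty is obtained or used; a bootstrapped Q-ensemble whose disagreement gates a fallback action is one standard construction and is sketched below. The threshold, the fallback, and the ensemble form are assumptions, not the paper's exact method.

```python
import numpy as np

def choose_action(q_ensemble, state, fallback_action, variance_threshold=0.5):
    """Act greedily on the mean Q-values, but fall back to a safe default
    when the ensemble members disagree too much (illustrative only).

    q_ensemble : list of callables, each mapping state -> array of Q-values
    """
    qs = np.stack([q(state) for q in q_ensemble])   # shape (n_members, n_actions)
    mean_q = qs.mean(axis=0)
    greedy = int(np.argmax(mean_q))
    if qs[:, greedy].var() > variance_threshold:    # high epistemic uncertainty
        return fallback_action                      # defer to a cautious policy
    return greedy
```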
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.