An Online Data-Driven Emergency-Response Method for Autonomous Agents in
Unforeseen Situations
- URL: http://arxiv.org/abs/2112.09670v1
- Date: Fri, 17 Dec 2021 18:31:37 GMT
- Title: An Online Data-Driven Emergency-Response Method for Autonomous Agents in
Unforeseen Situations
- Authors: Glenn Maguire, Nicholas Ketz, Praveen Pilly, Jean-Baptiste Mouret
- Abstract summary: This paper presents an online, data-driven, emergency-response method.
It aims to provide autonomous agents the ability to react to unexpected situations.
We demonstrate the potential of this approach in a simulated 3D car driving scenario.
- Score: 4.339510167603376
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning agents perform well when presented with inputs within
the distribution of those encountered during training. However, they are unable
to respond effectively when faced with novel, out-of-distribution events, until
they have undergone additional training. This paper presents an online,
data-driven, emergency-response method that aims to provide autonomous agents
with the ability to react to unexpected situations that are very different from
those they have been trained or designed to address. In such situations, learned
policies cannot be expected to perform appropriately since the observations
obtained in these novel situations would fall outside the distribution of
inputs that the agent has been optimized to handle. The proposed approach
devises a customized response to the unforeseen situation sequentially, by
selecting actions that minimize the rate of increase of the reconstruction
error from a variational auto-encoder. This optimization is achieved online in
a data-efficient manner (on the order of 30 data-points) using a modified
Bayesian optimization procedure. We demonstrate the potential of this approach
in a simulated 3D car driving scenario, in which the agent devises a response
in under 2 seconds to avoid collisions with objects it has not seen during
training.
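The abstract describes the response mechanism only in words. A minimal illustration of that loop in Python, assuming a pretrained VAE exposed as a recon_error(observation) function, a low-dimensional continuous action space, a classic Gym-style environment, and a generic Gaussian-process surrogate with a lower-confidence-bound pick standing in for the authors' modified Bayesian optimization procedure, might look like this:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def emergency_response(env, obs, recon_error, n_steps=30, n_candidates=256,
                       action_dim=2, kappa=2.0):
    """Pick actions online so that the VAE reconstruction error grows as slowly
    as possible (illustrative sketch, not the authors' exact procedure)."""
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    X, y = [], []                        # actions tried / observed error-growth rates
    prev_err = recon_error(obs)
    for _ in range(n_steps):             # budget on the order of 30 data points
        cand = np.random.uniform(-1.0, 1.0, size=(n_candidates, action_dim))
        if X:
            gp.fit(np.array(X), np.array(y))
            mu, sigma = gp.predict(cand, return_std=True)
            action = cand[np.argmin(mu - kappa * sigma)]   # optimistic pick for minimisation
        else:
            action = cand[0]             # no data yet: try an arbitrary candidate
        obs, _, done, _ = env.step(action)   # classic Gym 4-tuple API assumed
        err = recon_error(obs)
        X.append(action)
        y.append(err - prev_err)         # per-step rate of increase of the error
        prev_err = err
        if done:
            break
    return obs
```

The key quantity is the per-step change in reconstruction error: a novel obstacle drives the error up, and actions that slow that growth tend to steer the agent back toward observations it was trained on.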
Related papers
- Representation-based Reward Modeling for Efficient Safety Alignment of Large Language Model [84.00480999255628]
Reinforcement Learning algorithms for safety alignment of Large Language Models (LLMs) encounter the challenge of distribution shift.
Current approaches typically address this issue through online sampling from the target policy.
We propose a new framework that leverages the model's intrinsic safety judgment capability to extract reward signals.
arXiv Detail & Related papers (2025-03-13T06:40:34Z)
Re-thinking Data Availablity Attacks Against Deep Neural Networks [53.64624167867274]
In this paper, we re-examine the concept of unlearnable examples and discern that the existing robust error-minimizing noise presents an inaccurate optimization objective.
We introduce a novel optimization paradigm that yields improved protection results with reduced computational time requirements.
arXiv Detail & Related papers (2023-05-18T04:03:51Z)
Let Offline RL Flow: Training Conservative Agents in the Latent Space of Normalizing Flows [58.762959061522736]
Offline reinforcement learning aims to train a policy on a pre-recorded and fixed dataset without any additional environment interactions.
We build upon recent works on learning policies in latent action spaces and use a special form of Normalizing Flows for constructing a generative model (a simplified latent-action sketch follows this entry).
We evaluate our method on various locomotion and navigation tasks, demonstrating that our approach outperforms recently proposed algorithms.
arXiv Detail & Related papers (2022-11-20T21:57:10Z)
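The summary above does not specify the flow architecture; the sketch below illustrates the general latent-action idea with a plain RealNVP-style affine coupling stack. The decoder, its lack of state conditioning, and the layer count are all assumptions, not the paper's "special form" of flow.

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """One RealNVP-style coupling layer: half of the vector is transformed
    with a scale and shift predicted from the other half."""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.half = dim // 2
        self.net = nn.Sequential(
            nn.Linear(self.half, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (dim - self.half)),
        )

    def forward(self, z):                       # latent -> action space
        z1, z2 = z[:, :self.half], z[:, self.half:]
        log_s, t = self.net(z1).chunk(2, dim=-1)
        return torch.cat([z1, z2 * log_s.exp() + t], dim=-1)

class LatentActionDecoder(nn.Module):
    """Stack of couplings mapping latent codes to dataset-like actions."""
    def __init__(self, action_dim, n_layers=4):
        super().__init__()
        self.layers = nn.ModuleList(AffineCoupling(action_dim) for _ in range(n_layers))

    def forward(self, z):
        for layer in self.layers:
            z = layer(z)
            z = z.flip(dims=[-1])               # crude stand-in for a permutation layer
        return z

# The offline policy outputs a bounded latent code z, and the environment
# receives decoder(z); the decoder is trained beforehand as a generative
# model of the actions in the offline dataset.
decoder = LatentActionDecoder(action_dim=6)
z = torch.tanh(torch.randn(1, 6))               # stand-in for a policy output
action = decoder(z)
```

Restricting the policy to latent codes decoded through a generative model of dataset actions is what keeps the agent conservative: it can only emit actions the flow assigns probability mass to.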
Adaptive Behavior Cloning Regularization for Stable Offline-to-Online Reinforcement Learning [80.25648265273155]
Offline reinforcement learning, by learning from a fixed dataset, makes it possible to learn agent behaviors without interacting with the environment.
During online fine-tuning, the performance of the pre-trained agent may collapse quickly due to the sudden distribution shift from offline to online data.
We propose to adaptively weigh the behavior cloning loss during online fine-tuning based on the agent's performance and training stability (a toy weighting schedule is sketched after this entry).
Experiments show that the proposed method yields state-of-the-art offline-to-online reinforcement learning performance on the popular D4RL benchmark.
arXiv Detail & Related papers (2022-10-25T09:08:26Z)
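The exact weighting rule is not given in the summary above, so the schedule below is only a plausible reading: shrink the behavior-cloning coefficient while online returns improve and stay stable, and grow it back when performance degrades.

```python
import numpy as np

class AdaptiveBCWeight:
    """Illustrative schedule for the behavior-cloning coefficient during
    online fine-tuning (not the paper's exact rule)."""
    def __init__(self, init_weight=1.0, lr=0.05, window=10):
        self.weight = init_weight
        self.lr = lr
        self.window = window
        self.returns = []

    def update(self, episode_return):
        self.returns.append(episode_return)
        recent = self.returns[-self.window:]
        if len(recent) < 2:
            return self.weight
        trend = recent[-1] - np.mean(recent[:-1])       # improving or collapsing?
        stability = np.std(recent)                      # high variance -> stay conservative
        if trend > 0 and stability < abs(np.mean(recent)):
            self.weight = max(0.0, self.weight - self.lr)   # trust the RL objective more
        else:
            self.weight = min(1.0, self.weight + self.lr)   # lean on behavior cloning
        return self.weight

# Usage: total_loss = rl_loss + bc_schedule.update(last_return) * bc_loss
```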
Dynamic Memory for Interpretable Sequential Optimisation [0.0]
We present a solution for handling non-stationarity that is suitable for deployment at scale.
We develop an adaptive Bayesian learning agent that employs a novel form of dynamic memory.
We describe the architecture of a large-scale deployment of automated sequential optimisation as a service.
arXiv Detail & Related papers (2022-06-28T12:29:13Z)
Reinforcement Learning in the Wild: Scalable RL Dispatching Algorithm Deployed in Ridehailing Marketplace [12.298997392937876]
This study proposes a real-time dispatching algorithm based on reinforcement learning.
It is deployed online in multiple cities under DiDi's operation for A/B testing and is launched in one of the major international markets.
The deployed algorithm shows over 1.3% improvement in total driver income from A/B testing.
arXiv Detail & Related papers (2022-02-10T16:07:17Z)
Lifelong Unsupervised Domain Adaptive Person Re-identification with Coordinated Anti-forgetting and Adaptation [127.6168183074427]
We propose a new task, Lifelong Unsupervised Domain Adaptive (LUDA) person ReID.
This is challenging because it requires the model to continuously adapt to unlabeled data of the target environments.
We design an effective scheme for this task, dubbed CLUDA-ReID, where the anti-forgetting is harmoniously coordinated with the adaptation.
arXiv Detail & Related papers (2021-12-13T13:19:45Z)
UMBRELLA: Uncertainty-Aware Model-Based Offline Reinforcement Learning Leveraging Planning [1.1339580074756188]
Offline reinforcement learning (RL) provides a framework for learning decision-making from offline data.
Self-driving vehicles (SDVs) can learn a policy that potentially even outperforms the behavior present in the sub-optimal dataset.
This motivates the use of model-based offline RL approaches, which leverage planning.
arXiv Detail & Related papers (2021-11-22T10:37:52Z)
Offline-to-Online Reinforcement Learning via Balanced Replay and Pessimistic Q-Ensemble [135.6115462399788]
Deep offline reinforcement learning has made it possible to train strong robotic agents from offline datasets.
State-action distribution shift may lead to severe bootstrap error during fine-tuning.
We propose a balanced replay scheme that prioritizes samples encountered online while also encouraging the use of near-on-policy samples (a simplified version is sketched after this entry).
arXiv Detail & Related papers (2021-07-01T16:26:54Z)
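As a rough illustration of the balanced replay idea summarized above: keep offline and online transitions in separate pools and bias minibatch sampling toward the online, near-on-policy pool. The actual scheme is priority-based rather than a fixed ratio, so the fraction below is an assumption.

```python
import random

class BalancedReplay:
    """Toy balanced replay: sample a fixed fraction of each minibatch from
    transitions collected online, the rest from the offline dataset."""
    def __init__(self, offline_data, online_fraction=0.75):
        self.offline = list(offline_data)
        self.online = []
        self.online_fraction = online_fraction

    def add_online(self, transition):
        self.online.append(transition)

    def sample(self, batch_size):
        # Draw as many online samples as the fraction allows, fill up from offline.
        n_online = min(int(batch_size * self.online_fraction), len(self.online))
        batch = random.sample(self.online, n_online) if n_online else []
        batch += random.sample(self.offline, batch_size - n_online)
        return batch
```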
Can Autonomous Vehicles Identify, Recover From, and Adapt to Distribution Shifts? [104.04999499189402]
Out-of-training-distribution (OOD) scenarios are a common challenge of learning agents at deployment.
We propose an uncertainty-aware planning method, called robust imitative planning (RIP), sketched in simplified form after this entry.
Our method can detect and recover from some distribution shifts, reducing the overconfident and catastrophic extrapolations in OOD scenes.
We introduce an autonomous car novel-scene benchmark, CARNOVEL, to evaluate the robustness of driving agents to a suite of tasks with distribution shifts.
arXiv Detail & Related papers (2020-06-26T11:07:32Z)
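The summary names robust imitative planning but not its selection rule; one common reading is a worst-case-over-ensemble criterion, sketched below under the assumption that each ensemble member can score a candidate plan (for example, by log-likelihood under an imitation model).

```python
import numpy as np

def robust_plan(candidate_plans, ensemble_scores):
    """Pick the plan whose worst-case score across an ensemble of imitation
    models is highest -- one reading of a min-max planning rule.

    candidate_plans : list of candidate trajectories / action sequences
    ensemble_scores : callable(plan) -> array of per-model scores
    """
    worst_case = [np.min(ensemble_scores(plan)) for plan in candidate_plans]
    return candidate_plans[int(np.argmax(worst_case))]
```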
Tactical Decision-Making in Autonomous Driving by Reinforcement Learning with Uncertainty Estimation [0.9883261192383611]
Reinforcement learning can be used to create a tactical decision-making agent for autonomous driving.
This paper investigates how a Bayesian RL technique can be used to estimate the uncertainty of decisions in autonomous driving (an ensemble-based uncertainty check is sketched after this entry).
arXiv Detail & Related papers (2020-04-22T08:22:28Z)
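The summary does not spell out how the decision uncertainty is obtained or used; a bootstrapped Q-ensemble whose disagreement gates a fallback action is one standard construction and is sketched below. The threshold, the fallback, and the ensemble form are assumptions, not the paper's exact method.

```python
import numpy as np

def choose_action(q_ensemble, state, fallback_action, variance_threshold=0.5):
    """Act greedily on the mean Q-values, but fall back to a safe default
    when the ensemble members disagree too much (illustrative only).

    q_ensemble : list of callables, each mapping state -> array of Q-values
    """
    qs = np.stack([q(state) for q in q_ensemble])   # shape (n_members, n_actions)
    mean_q = qs.mean(axis=0)
    greedy = int(np.argmax(mean_q))
    if qs[:, greedy].var() > variance_threshold:    # high epistemic uncertainty
        return fallback_action                      # defer to a cautious policy
    return greedy
```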
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.