Related papers: Active Learning with Safety Constraints

Active Learning with Safety Constraints

URL: http://arxiv.org/abs/2206.11183v1
Date: Wed, 22 Jun 2022 15:45:38 GMT
Title: Active Learning with Safety Constraints
Authors: Romain Camilleri, Andrew Wagenmaker, Jamie Morgenstern, Lalit Jain, Kevin Jamieson
Abstract summary: We investigate the complexity of learning the best safe decision in interactive environments. We propose an adaptive experimental design-based algorithm, which we show efficiently trades off between the difficulty of showing an arm is unsafe vs suboptimal.
Score: 25.258564629480063
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Active learning methods have shown great promise in reducing the number of samples necessary for learning. As automated learning systems are adopted into real-time, real-world decision-making pipelines, it is increasingly important that such algorithms are designed with safety in mind. In this work we investigate the complexity of learning the best safe decision in interactive environments. We reduce this problem to a constrained linear bandits problem, where our goal is to find the best arm satisfying certain (unknown) safety constraints. We propose an adaptive experimental design-based algorithm, which we show efficiently trades off between the difficulty of showing an arm is unsafe vs suboptimal. To our knowledge, our results are the first on best-arm identification in linear bandits with safety constraints. In practice, we demonstrate that this approach performs well on synthetic and real world datasets.

Related papers

Amortized Safe Active Learning for Real-Time Decision-Making: Pretrained Neural Policies from Simulated Nonparametric Functions [23.406516455945653]
Active Learning (AL) is a sequential learning approach aiming at selecting the most informative data for model training. Key challenges of AL are the repeated model training and acquisition optimization required for data selection. By leveraging a pretrained neural network policy, our method eliminates the need for repeated model training and acquisition optimization.
arXiv Detail & Related papers (2025-01-26T09:05:52Z)
Approximate Shielding of Atari Agents for Safe Exploration [83.55437924143615]
We propose a principled algorithm for safe exploration based on the concept of shielding. We present preliminary results that show our approximate shielding algorithm effectively reduces the rate of safety violations.
arXiv Detail & Related papers (2023-04-21T16:19:54Z)
Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this scope from the perspective of state-wise safe RL. We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection. To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
arXiv Detail & Related papers (2022-12-12T06:30:17Z)
Log Barriers for Safe Black-box Optimization with Application to Safe Reinforcement Learning [72.97229770329214]
We introduce a general approach for seeking high dimensional non-linear optimization problems in which maintaining safety during learning is crucial. Our approach called LBSGD is based on applying a logarithmic barrier approximation with a carefully chosen step size. We demonstrate the effectiveness of our approach on minimizing violation in policy tasks in safe reinforcement learning.
arXiv Detail & Related papers (2022-07-21T11:14:47Z)
Interactively Learning Preference Constraints in Linear Bandits [100.78514640066565]
We study sequential decision-making with known rewards and unknown constraints. As an application, we consider learning constraints to represent human preferences in a driving simulation.
arXiv Detail & Related papers (2022-06-10T17:52:58Z)
Safety and Liveness Guarantees through Reach-Avoid Reinforcement Learning [24.56889192688925]
Reach-avoid optimal control problems are central to safety and liveness assurance for autonomous robotic systems. Recent successes in reinforcement learning methods to approximately solve optimal control problems with performance objectives make their application to certification problems attractive. Recent work has shown promise in extending the reinforcement learning machinery to handle safety-type problems, whose objective is not a sum, but a minimum (or maximum) over time.
arXiv Detail & Related papers (2021-12-23T00:44:38Z)
Best Arm Identification with Safety Constraints [3.7783523378336112]
The best arm identification problem in the multi-armed bandit setting is an excellent model of many real-world decision-making problems. We study the question of best-arm identification in safety-critical settings, where the goal of the agent is to find the best safe option out of many. We propose an algorithm in this setting which is guaranteed to learn safely.
arXiv Detail & Related papers (2021-11-23T20:53:12Z)
Efficient falsification approach for autonomous vehicle validation using a parameter optimisation technique based on reinforcement learning [6.198523595657983]
The widescale deployment of Autonomous Vehicles (AV) appears to be imminent despite many safety challenges that are yet to be resolved. The uncertainties in the behaviour of the traffic participants and the dynamic world cause reactions in advanced autonomous systems. This paper presents an efficient falsification method to evaluate the System Under Test.
arXiv Detail & Related papers (2020-11-16T02:56:13Z)
Guided Uncertainty-Aware Policy Optimization: Combining Learning and Model-Based Strategies for Sample-Efficient Policy Learning [75.56839075060819]
Traditional robotic approaches rely on an accurate model of the environment, a detailed description of how to perform the task, and a robust perception system to keep track of the current state. reinforcement learning approaches can operate directly from raw sensory inputs with only a reward signal to describe the task, but are extremely sample-inefficient and brittle. In this work, we combine the strengths of model-based methods with the flexibility of learning-based methods to obtain a general method that is able to overcome inaccuracies in the robotics perception/actuation pipeline.
arXiv Detail & Related papers (2020-05-21T19:47:05Z)
Chance-Constrained Trajectory Optimization for Safe Exploration and Learning of Nonlinear Systems [81.7983463275447]
Learning-based control algorithms require data collection with abundant supervision for training. We present a new approach for optimal motion planning with safe exploration that integrates chance-constrained optimal control with dynamics learning and feedback control.
arXiv Detail & Related papers (2020-05-09T05:57:43Z)

This list is automatically generated from the titles and abstracts of the papers in this site.