Learning Safety Constraints From Demonstration Using One-Class Decision Trees
- URL: http://arxiv.org/abs/2312.08837v1
- Date: Thu, 14 Dec 2023 11:48:22 GMT
- Title: Learning Safety Constraints From Demonstration Using One-Class Decision Trees
- Authors: Mattijs Baert, Sam Leroux, Pieter Simoens
- Abstract summary: We present a novel approach that leverages one-class decision trees to facilitate learning from expert demonstrations.
The learned constraints are subsequently employed within an oracle constrained reinforcement learning framework.
In contrast to other methods, our approach offers an interpretable representation of the constraints, a vital feature in safety-critical environments.
- Score: 1.81343777902022
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The alignment of autonomous agents with human values is a pivotal challenge
when deploying these agents within physical environments, where safety is an
important concern. However, defining the agent's objective as a reward and/or
cost function is inherently complex and prone to human errors. In response to
this challenge, we present a novel approach that leverages one-class decision
trees to facilitate learning from expert demonstrations. These decision trees
provide a foundation for representing a set of constraints pertinent to the
given environment as a logical formula in disjunctive normal form. The learned
constraints are subsequently employed within an oracle constrained
reinforcement learning framework, enabling the acquisition of a safe policy. In
contrast to other methods, our approach offers an interpretable representation
of the constraints, a vital feature in safety-critical environments. To
validate the effectiveness of our proposed method, we conduct experiments in
synthetic benchmark domains and a realistic driving environment.
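As a rough illustration of the idea (not the paper's algorithm: the feature names, thresholds, and the quantile-based box below are placeholders for the actual one-class decision tree), a region covering the expert demonstrations can be learned and its complement written as a logical formula in disjunctive normal form:

```python
import numpy as np

def fit_safe_box(demos, q=0.01):
    """Fit axis-aligned bounds covering the expert-visited states.
    A (very) simplified stand-in for a one-class decision tree:
    each feature gets a [lo, hi] interval from empirical quantiles."""
    lo = np.quantile(demos, q, axis=0)
    hi = np.quantile(demos, 1.0 - q, axis=0)
    return lo, hi

def unsafe_dnf(lo, hi, names):
    """Express the complement of the safe box in DNF:
    a state is unsafe if ANY feature leaves its interval."""
    clauses = []
    for name, l, h in zip(names, lo, hi):
        clauses.append(f"({name} < {l:.3f})")
        clauses.append(f"({name} > {h:.3f})")
    return " OR ".join(clauses)

# Hypothetical 2-D demonstrations: position and velocity of an expert.
rng = np.random.default_rng(0)
demos = rng.normal([0.0, 1.0], [0.5, 0.2], size=(1000, 2))
lo, hi = fit_safe_box(demos)
print(unsafe_dnf(lo, hi, ["pos", "vel"]))
```

In the oracle constrained RL stage, a formula like this can be evaluated on each visited state to signal constraint violations, and being a readable logical formula is what makes the representation interpretable.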
Related papers
- Feasibility Consistent Representation Learning for Safe Reinforcement Learning [25.258227763316228]
We introduce a novel framework named Feasibility Consistent Safe Reinforcement Learning (FCSRL).
This framework combines representation learning with feasibility-oriented objectives to identify and extract safety-related information from the raw state for safe RL.
Our method learns a better safety-aware embedding and achieves superior performance compared to previous representation learning baselines.
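A hedged sketch of the general pattern (an auxiliary feasibility/cost prediction head shaping the embedding; the paper's actual objectives and architecture may differ):

```python
import torch
import torch.nn as nn

class FeasibilityEncoder(nn.Module):
    """Illustrative feasibility-aware state encoder: an auxiliary head
    is trained to predict the per-step safety cost, so the embedding
    retains safety-relevant information for the downstream policy."""

    def __init__(self, obs_dim, emb_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.ReLU(), nn.Linear(128, emb_dim)
        )
        self.cost_head = nn.Linear(emb_dim, 1)  # predicts safety cost

    def forward(self, obs):
        z = self.encoder(obs)
        return z, self.cost_head(z)

enc = FeasibilityEncoder(obs_dim=8)
obs = torch.randn(32, 8)
cost = torch.rand(32, 1)            # observed per-step costs
z, cost_pred = enc(obs)
aux_loss = nn.functional.mse_loss(cost_pred, cost)
aux_loss.backward()                 # combined with the usual RL loss
```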
arXiv Detail & Related papers (2024-05-20T01:37:21Z)
- HAZARD Challenge: Embodied Decision Making in Dynamically Changing Environments [93.94020724735199]
HAZARD consists of three unexpected disaster scenarios: fire, flood, and wind.
This benchmark enables us to evaluate autonomous agents' decision-making capabilities across various pipelines.
arXiv Detail & Related papers (2024-01-23T18:59:43Z)
- Constrained Meta-Reinforcement Learning for Adaptable Safety Guarantee with Differentiable Convex Programming [4.825619788907192]
This paper studies the unique challenges of ensuring safety in non-stationary environments by solving constrained problems through the lens of the meta-learning approach (learning-to-learn).
We first employ successive convex-constrained policy updates across multiple tasks with differentiable convex programming, which allows meta-learning in constrained scenarios by enabling end-to-end differentiation.
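As a minimal sketch of end-to-end differentiation through a convex-constrained update, the cvxpylayers library can differentiate through a generic QP projection (the problem below is illustrative, not the paper's exact formulation):

```python
import cvxpy as cp
import torch
from cvxpylayers.torch import CvxpyLayer

# Differentiable projection of a proposed step onto a linearly
# constrained set: min ||x - x0||^2  s.t.  A x <= b.
n, m = 4, 2
x = cp.Variable(n)
x0 = cp.Parameter(n)
A = cp.Parameter((m, n))
b = cp.Parameter(m)
problem = cp.Problem(cp.Minimize(cp.sum_squares(x - x0)), [A @ x <= b])
layer = CvxpyLayer(problem, parameters=[x0, A, b], variables=[x])

x0_t = torch.randn(n, requires_grad=True)
A_t = torch.randn(m, n)
b_t = torch.ones(m)
(x_star,) = layer(x0_t, A_t, b_t)
x_star.sum().backward()   # gradients flow through the solver
print(x0_t.grad)
```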
arXiv Detail & Related papers (2023-12-15T21:55:43Z)
- Risk-Aware Continuous Control with Neural Contextual Bandits [8.911816419902427]
We propose a risk-aware decision-making framework for contextual bandit problems.
Our framework is designed to cater to various risk levels, effectively balancing constraint satisfaction against performance.
We evaluate our framework in a real-world use case involving a 5G mobile network.
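A minimal sketch of such a risk-constrained decision rule (hypothetical reward and violation estimates; the paper's neural-bandit machinery is not reproduced here):

```python
import numpy as np

def risk_aware_select(reward_est, violation_prob, alpha=0.1):
    """Pick the arm with the best estimated reward among those whose
    estimated constraint-violation probability is below the risk
    level alpha; fall back to the safest arm if none qualifies."""
    feasible = np.where(violation_prob <= alpha)[0]
    if feasible.size == 0:
        return int(np.argmin(violation_prob))
    return int(feasible[np.argmax(reward_est[feasible])])

reward_est = np.array([0.9, 0.7, 0.5])
violation_prob = np.array([0.3, 0.05, 0.01])
print(risk_aware_select(reward_est, violation_prob, alpha=0.1))  # -> 1
```

Varying alpha trades constraint satisfaction against performance, which is the balance the abstract describes.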
arXiv Detail & Related papers (2023-12-15T17:16:04Z)
- Maximum Causal Entropy Inverse Constrained Reinforcement Learning [3.409089945290584]
We propose a novel method that utilizes the principle of maximum causal entropy to learn constraints and an optimal policy.
We evaluate the effectiveness of the learned policy by assessing the reward received and the number of constraint violations.
Our method has been shown to outperform state-of-the-art approaches across a variety of tasks and environments.
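The maximum causal entropy principle yields a softmax policy over soft Q-values. A toy sketch of the resulting soft value iteration (generic backup, not the paper's full constraint-learning loop; the MDP below is hypothetical):

```python
import numpy as np

def soft_value_iteration(R, P, gamma=0.9, iters=200):
    """Maximum-causal-entropy backup: V(s) = log sum_a exp(Q(s,a)),
    pi(a|s) proportional to exp(Q(s,a)).  R: (S, A) rewards (minus any
    learned constraint penalties), P: (S, A, S) transition tensor."""
    S, A = R.shape
    V = np.zeros(S)
    for _ in range(iters):
        Q = R + gamma * (P @ V)        # (S, A)
        V = np.log(np.exp(Q).sum(axis=1))
    return np.exp(Q - V[:, None])      # softmax policy

# Tiny hypothetical MDP: 2 states, 2 actions, uniform transitions.
R = np.array([[1.0, 0.0], [0.0, 0.5]])
P = np.full((2, 2, 2), 0.5)
print(soft_value_iteration(R, P).round(3))
```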
arXiv Detail & Related papers (2023-05-04T14:18:19Z)
- Approximate Shielding of Atari Agents for Safe Exploration [83.55437924143615]
We propose a principled algorithm for safe exploration based on the concept of shielding.
We present preliminary results that show our approximate shielding algorithm effectively reduces the rate of safety violations.
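A shield sits between the agent and the environment and overrides actions predicted to be unsafe. A minimal sketch, assuming a learned violation-probability model (how those estimates are actually obtained is the substance of the paper and is not reproduced here):

```python
def shield(action, state, violation_model, threshold=0.05, safe_action=0):
    """Override the agent's action whenever the model predicts too high
    a probability that it leads to a safety violation."""
    if violation_model(state, action) > threshold:
        return safe_action
    return action

# Hypothetical violation model: action 1 is risky in "hot" states.
violation_model = lambda s, a: 0.9 if (s > 0.5 and a == 1) else 0.01
print(shield(action=1, state=0.8, violation_model=violation_model))  # -> 0
```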
arXiv Detail & Related papers (2023-04-21T16:19:54Z)
- A Multiplicative Value Function for Safe and Efficient Reinforcement Learning [131.96501469927733]
We propose a safe model-free RL algorithm with a novel multiplicative value function consisting of a safety critic and a reward critic.
The safety critic predicts the probability of constraint violation and discounts the reward critic that only estimates constraint-free returns.
We evaluate our method in four safety-focused environments, including classical RL benchmarks augmented with safety constraints and robot navigation tasks with images and raw Lidar scans as observations.
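The decomposition described above is straightforward to sketch: the value used for policy improvement is the product of a constraint-free return estimate and the predicted probability of remaining safe (the network sizes below are placeholders):

```python
import torch
import torch.nn as nn

class MultiplicativeCritic(nn.Module):
    """Reward critic estimates constraint-free return; safety critic
    estimates the probability of staying within constraints; their
    product discounts value in risky regions."""

    def __init__(self, dim):
        super().__init__()
        self.reward_critic = nn.Sequential(
            nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 1))
        self.safety_critic = nn.Sequential(
            nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, sa):
        q_reward = self.reward_critic(sa)               # return estimate
        p_safe = torch.sigmoid(self.safety_critic(sa))  # P(no violation)
        return q_reward * p_safe

critic = MultiplicativeCritic(dim=10)
print(critic(torch.randn(5, 10)).shape)  # torch.Size([5, 1])
```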
arXiv Detail & Related papers (2023-03-07T18:29:15Z)
- Risk-Averse Model Uncertainty for Distributionally Robust Safe Reinforcement Learning [3.9821399546174825]
We introduce a deep reinforcement learning framework for safe decision making in uncertain environments.
We provide robustness guarantees for this framework by showing it is equivalent to a specific class of distributionally robust safe reinforcement learning problems.
In experiments on continuous control tasks with safety constraints, we demonstrate that our framework produces robust performance and safety at deployment time across a range of perturbed test environments.
arXiv Detail & Related papers (2023-01-30T00:37:06Z)
- Evaluating Model-free Reinforcement Learning toward Safety-critical Tasks [70.76757529955577]
This paper revisits prior work in this area from the perspective of state-wise safe RL.
We propose Unrolling Safety Layer (USL), a joint method that combines safety optimization and safety projection.
To facilitate further research in this area, we reproduce related algorithms in a unified pipeline and incorporate them into SafeRL-Kit.
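The projection half of such a method admits a closed form for a single linearized constraint; a sketch for intuition (the classic safety-layer correction, shown here as an illustration rather than as USL's exact procedure):

```python
import numpy as np

def project_action(a, g, c, d=0.0):
    """Closed-form safety projection for one linearized constraint
    c + g.a <= d: shift the action just enough to satisfy it."""
    slack = c + g @ a - d
    if slack <= 0:
        return a                       # already safe
    return a - (slack / (g @ g)) * g   # minimal L2 correction

a = np.array([1.0, 0.5])
g = np.array([2.0, 0.0])   # constraint gradient w.r.t. the action
print(project_action(a, g, c=-1.0))   # -> [0.5, 0.5], on the boundary
```

Per the abstract, USL combines a projection of this kind with a gradient-based safety optimization stage rather than using projection alone.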
arXiv Detail & Related papers (2022-12-12T06:30:17Z)
- Guided Uncertainty-Aware Policy Optimization: Combining Learning and Model-Based Strategies for Sample-Efficient Policy Learning [75.56839075060819]
Traditional robotic approaches rely on an accurate model of the environment, a detailed description of how to perform the task, and a robust perception system to keep track of the current state.
In contrast, reinforcement learning approaches can operate directly from raw sensory inputs with only a reward signal to describe the task, but they are extremely sample-inefficient and brittle.
In this work, we combine the strengths of model-based methods with the flexibility of learning-based methods to obtain a general method that is able to overcome inaccuracies in the robotics perception/actuation pipeline.
arXiv Detail & Related papers (2020-05-21T19:47:05Z)
- Learning Control Barrier Functions from Expert Demonstrations [69.23675822701357]
We propose a learning-based approach to safe controller synthesis based on control barrier functions (CBFs).
We analyze an optimization-based approach to learning a CBF that enjoys provable safety guarantees under suitable Lipschitz assumptions on the underlying dynamical system.
To the best of our knowledge, these are the first results that learn provably safe control barrier functions from data.
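A common way to learn a candidate CBF h from demonstrations is to penalize violations of h >= 0 on expert states and of the barrier decrease condition along demonstrated transitions. A simplified sketch (the paper's provable guarantees additionally rely on Lipschitz assumptions that this toy loss does not enforce):

```python
import torch
import torch.nn as nn

h = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 1))  # candidate CBF

def cbf_losses(x_safe, x, x_next, alpha=0.1, margin=0.01):
    """Hinge losses encouraging (i) h >= 0 on expert-visited states and
    (ii) the discrete-time barrier condition
    h(x') - h(x) >= -alpha * h(x) along demonstrated transitions."""
    safe_loss = torch.relu(margin - h(x_safe)).mean()
    decrease = h(x_next) - (1 - alpha) * h(x)
    dyn_loss = torch.relu(margin - decrease).mean()
    return safe_loss + dyn_loss

x_safe = torch.randn(64, 4)                     # hypothetical expert states
x, x_next = torch.randn(64, 4), torch.randn(64, 4)
loss = cbf_losses(x_safe, x, x_next)
loss.backward()
```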
arXiv Detail & Related papers (2020-04-07T12:29:06Z)