DIAL: Distribution-Informed Adaptive Learning of Multi-Task Constraints for Safety-Critical Systems
- URL: http://arxiv.org/abs/2501.18086v1
- Date: Thu, 30 Jan 2025 01:56:07 GMT
- Title: DIAL: Distribution-Informed Adaptive Learning of Multi-Task Constraints for Safety-Critical Systems
- Authors: Se-Wook Yoo, Seung-Woo Seo
- Abstract summary: Safe reinforcement learning has traditionally relied on predefined constraint functions to ensure safety in complex real-world tasks, such as autonomous driving.
Recent research highlights the potential of leveraging pre-acquired task-agnostic knowledge to enhance both safety and sample efficiency in related tasks.
We propose a novel method to learn shared constraint distributions across multiple tasks.
Our approach identifies the shared constraints through imitation learning and then adapts to new tasks by adjusting risk levels within these learned distributions.
- Score: 13.93024489228903
- Abstract: Safe reinforcement learning has traditionally relied on predefined constraint functions to ensure safety in complex real-world tasks, such as autonomous driving. However, defining these functions accurately for varied tasks is a persistent challenge. Recent research highlights the potential of leveraging pre-acquired task-agnostic knowledge to enhance both safety and sample efficiency in related tasks. Building on this insight, we propose a novel method to learn shared constraint distributions across multiple tasks. Our approach identifies the shared constraints through imitation learning and then adapts to new tasks by adjusting risk levels within these learned distributions. This adaptability addresses variations in risk sensitivity stemming from expert-specific biases, ensuring consistent adherence to general safety principles even with imperfect demonstrations. Our method can be applied to control and navigation domains, including multi-task and meta-task scenarios, accommodating constraints such as maintaining safe distances or adhering to speed limits. Experimental results validate the efficacy of our approach, demonstrating superior safety performance and success rates compared to baselines, all without requiring task-specific constraint definitions. These findings underscore the versatility and practicality of our method across a wide range of real-world tasks.
Related papers
- Task-Aware Harmony Multi-Task Decision Transformer for Offline Reinforcement Learning [70.96345405979179]
The purpose of offline multi-task reinforcement learning (MTRL) is to develop a unified policy applicable to diverse tasks without the need for online environmental interaction.
Variations in task content and complexity pose significant challenges in policy formulation.
We introduce the Harmony Multi-Task Decision Transformer (HarmoDT), a novel solution designed to identify an optimal harmony subspace of parameters for each task.
arXiv Detail & Related papers (2024-11-02T05:49:14Z)
- Active Fine-Tuning of Generalist Policies [54.65568433408307]
We propose AMF (Active Multi-task Fine-tuning) to maximize multi-task policy performance under a limited demonstration budget.
We derive performance guarantees for AMF under regularity assumptions and demonstrate its empirical effectiveness in complex and high-dimensional environments.
arXiv Detail & Related papers (2024-10-07T13:26:36Z)
- A CMDP-within-online framework for Meta-Safe Reinforcement Learning [23.57318558833378]
We study the problem of meta-safe reinforcement learning (Meta-SRL) through the CMDP-within-online framework.
We obtain task-averaged regret bounds for the optimality gap and constraint violations using gradient-based meta-learning.
We propose a meta-algorithm that performs inexact online learning on the upper bounds of within-task optimality gap and constraint violations.
arXiv Detail & Related papers (2024-05-26T15:28:42Z)
- Safe and Balanced: A Framework for Constrained Multi-Objective Reinforcement Learning [26.244121960815907]
We propose a primal-based framework that orchestrates policy optimization between multi-objective learning and constraint adherence.
Our method employs a novel natural policy gradient manipulation method to optimize multiple RL objectives.
Empirically, our proposed method also outperforms prior state-of-the-art methods on challenging safe multi-objective reinforcement learning tasks.
arXiv Detail & Related papers (2024-05-26T00:42:10Z)
- Towards Interpretable Reinforcement Learning with Constrained Normalizing Flow Policies [5.6872893893453105]
Reinforcement learning policies are typically represented by black-box neural networks.
We propose constrained normalizing flow policies as interpretable and safe-by-construction policy models.
arXiv Detail & Related papers (2024-05-02T11:40:15Z)
- Uniformly Safe RL with Objective Suppression for Multi-Constraint Safety-Critical Applications [73.58451824894568]
The widely adopted CMDP model constrains the risks in expectation, which makes room for dangerous behaviors in long-tail states.
In safety-critical domains, such behaviors could lead to disastrous outcomes.
We propose Objective Suppression, a novel method that adaptively suppresses the task reward maximizing objectives according to a safety critic.
arXiv Detail & Related papers (2024-02-23T23:22:06Z)
- Constrained Meta-Reinforcement Learning for Adaptable Safety Guarantee with Differentiable Convex Programming [4.825619788907192]
This paper studies the unique challenges of ensuring safety in non-stationary environments by solving constrained problems through the lens of the meta-learning approach (learning-to-learn).
We first employ successive convex-constrained policy updates across multiple tasks with differentiable convex programming, which allows meta-learning in constrained scenarios by enabling end-to-end differentiation.
arXiv Detail & Related papers (2023-12-15T21:55:43Z)
- Learning Options via Compression [62.55893046218824]
We propose a new objective that combines the maximum likelihood objective with a penalty on the description length of the skills.
Our objective learns skills that solve downstream tasks in fewer samples compared to skills learned from only maximizing likelihood.
arXiv Detail & Related papers (2022-12-08T22:34:59Z)
- Variational Multi-Task Learning with Gumbel-Softmax Priors [105.22406384964144]
Multi-task learning aims to explore task relatedness to improve individual tasks.
We propose variational multi-task learning (VMTL), a general probabilistic inference framework for learning multiple related tasks.
arXiv Detail & Related papers (2021-11-09T18:49:45Z)
- Learning to Be Cautious [71.9871661858886]
A key challenge in the field of reinforcement learning is to develop agents that behave cautiously in novel situations.
We present a sequence of tasks where cautious behavior becomes increasingly non-obvious, as well as an algorithm to demonstrate that it is possible for a system to learn to be cautious.
arXiv Detail & Related papers (2021-10-29T16:52:45Z)
- Safety Aware Reinforcement Learning (SARL) [4.4617911035181095]
We focus on researching scenarios where agents can cause undesired side effects while executing a policy on a primary task.
Since one can define multiple tasks for a given environment dynamics, there are two important challenges.
We propose Safety Aware Reinforcement Learning (SARL) - a framework where a virtual safe agent modulates the actions of a main reward-based agent to minimize side effects.
arXiv Detail & Related papers (2020-10-06T16:08:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.