Assured Learning-enabled Autonomy: A Metacognitive Reinforcement
Learning Framework
- URL: http://arxiv.org/abs/2103.12558v1
- Date: Tue, 23 Mar 2021 14:01:35 GMT
- Title: Assured Learning-enabled Autonomy: A Metacognitive Reinforcement
Learning Framework
- Authors: Aquib Mustafa, Majid Mazouchi, Subramanya Nageshrao, Hamidreza Modares
- Abstract summary: Reinforcement learning (RL) agents with pre-specified reward functions cannot provide guaranteed safety across a variety of circumstances.
An assured autonomous control framework is presented in this paper by empowering RL algorithms with metacognitive learning capabilities.
- Score: 4.427447378048202
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning (RL) agents with pre-specified reward functions cannot
provide guaranteed safety across the variety of circumstances that an uncertain
system might encounter. To guarantee performance while assuring satisfaction of
safety constraints across a variety of circumstances, an assured autonomous
control framework is presented in this paper by empowering RL algorithms with
metacognitive learning capabilities. More specifically, the reward function
parameters of the RL agent are adapted in a metacognitive decision-making
layer to assure the feasibility of the RL agent, that is, to assure that the
policy learned by the RL agent satisfies the safety constraints specified by
signal temporal logic while achieving as much performance as possible. The
metacognitive layer monitors any possible future safety violation under the
actions of the RL agent and employs a higher-layer Bayesian RL algorithm to
proactively adapt the reward function for the lower-layer RL agent. To minimize
the higher-layer Bayesian RL intervention, a fitness function is leveraged by
the metacognitive layer as a metric to evaluate the success of the lower-layer
RL agent in satisfying the safety and liveness specifications, and the
higher-layer Bayesian RL intervenes only if there is a risk of lower-layer RL
failure. Finally, a simulation example is provided to validate the
effectiveness of the proposed approach.
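To make the division of labour concrete, the following is a minimal Python sketch of the two-layer loop described above. The agent interface (run_episode, learn), the STL robustness function, and the BayesianRewardAdapter are illustrative stand-ins rather than the authors' implementation; the higher layer here simply perturbs the best reward weights found so far as a crude proxy for a Bayesian optimization step.

```python
# Minimal sketch of the two-layer loop described in the abstract.
# All interfaces here (run_episode/learn, robustness_fn, the adapter) are
# illustrative assumptions, not the authors' implementation.
import numpy as np

class BayesianRewardAdapter:
    """Stand-in for the higher-layer Bayesian RL: it records the reward
    weights tried so far together with their fitness, and proposes new
    weights by perturbing the best candidate (a crude proxy for a proper
    Bayesian-optimization acquisition step)."""

    def __init__(self, init_weights):
        self.weights_history = [np.asarray(init_weights, dtype=float)]
        self.fitness_history = [-np.inf]

    def propose(self):
        best = self.weights_history[int(np.argmax(self.fitness_history))]
        return best + 0.1 * np.random.randn(*best.shape)

    def record(self, weights, fitness):
        self.weights_history.append(np.asarray(weights, dtype=float))
        self.fitness_history.append(float(fitness))

def metacognitive_loop(agent, env, robustness_fn, adapter,
                       episodes=100, fitness_threshold=0.0):
    """Lower-layer RL runs as usual; the metacognitive layer monitors the
    STL robustness of each trajectory and asks the higher layer for new
    reward weights only when the fitness signals a risk of violation."""
    weights = adapter.propose()
    for _ in range(episodes):
        trajectory = agent.run_episode(env, reward_weights=weights)
        fitness = robustness_fn(trajectory)   # > 0: spec satisfied with margin
        adapter.record(weights, fitness)
        if fitness < fitness_threshold:       # risk of lower-layer failure
            weights = adapter.propose()       # higher-layer intervention
        agent.learn(trajectory, reward_weights=weights)
    return weights
```

The point mirrored here is that the higher layer intervenes only when the fitness metric signals a risk of violating the specification, which keeps its interventions to a minimum.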
Related papers
- Approximate Model-Based Shielding for Safe Reinforcement Learning [83.55437924143615]
We propose a principled look-ahead shielding algorithm for verifying the performance of learned RL policies.
Our algorithm differs from other shielding approaches in that it does not require prior knowledge of the safety-relevant dynamics of the system.
We demonstrate superior performance to other safety-aware approaches on a set of Atari games with state-dependent safety-labels.
arXiv Detail & Related papers (2023-07-27T15:19:45Z)
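As a rough illustration of the look-ahead shielding idea in the entry above (not the AMBS algorithm itself), the sketch below rolls a learned approximate model forward under the current policy and falls back to a backup action whenever a predicted state violates the safety predicate; model.predict, is_unsafe, and backup_action are hypothetical names.

```python
# Illustrative look-ahead shield: simulate the candidate action with a learned
# approximate model and veto it if any predicted state is unsafe.
def shielded_action(state, policy, model, is_unsafe, backup_action, horizon=5):
    action = policy(state)
    s, a = state, action
    for _ in range(horizon):
        s = model.predict(s, a)        # learned, approximate dynamics
        if is_unsafe(s):               # predicted safety violation
            return backup_action(state)
        a = policy(s)
    return action
```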
- A Multiplicative Value Function for Safe and Efficient Reinforcement Learning [131.96501469927733]
We propose a safe model-free RL algorithm with a novel multiplicative value function consisting of a safety critic and a reward critic.
The safety critic predicts the probability of constraint violation and discounts the reward critic that only estimates constraint-free returns.
We evaluate our method in four safety-focused environments, including classical RL benchmarks augmented with safety constraints and robot navigation tasks with images and raw Lidar scans as observations.
arXiv Detail & Related papers (2023-03-07T18:29:15Z)
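The multiplicative construction described above can be illustrated in a few lines; safety_critic and reward_critic are hypothetical callables, and this is a sketch of the general idea rather than the paper's exact architecture.

```python
# Sketch of a multiplicative value estimate: a safety critic outputs the
# probability of staying constraint-free, and it discounts a reward critic
# that estimates returns as if no constraints existed.
def multiplicative_value(state, action, safety_critic, reward_critic):
    p_safe = safety_critic(state, action)    # in [0, 1]
    v_reward = reward_critic(state, action)  # constraint-free return estimate
    return p_safe * v_reward                 # value used for action selection
```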
- Safety Correction from Baseline: Towards the Risk-aware Policy in Robotics via Dual-agent Reinforcement Learning [64.11013095004786]
We propose a dual-agent safe reinforcement learning strategy consisting of a baseline and a safe agent.
Such a decoupled framework enables high flexibility, data efficiency and risk-awareness for RL-based control.
The proposed method outperforms the state-of-the-art safe RL algorithms on difficult robot locomotion and manipulation tasks.
arXiv Detail & Related papers (2022-12-14T03:11:25Z)
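A minimal sketch of the decoupled dual-agent idea from the entry above, assuming a hypothetical risk critic and a safe agent that can correct a proposed action; this illustrates the general pattern, not the paper's algorithm.

```python
# Sketch of the decoupled dual-agent pattern: a baseline agent optimises the
# task reward, while a separate safe agent corrects its action when the
# estimated risk is too high. risk_critic and correct are stand-ins.
def dual_agent_action(state, baseline_agent, safe_agent, risk_critic,
                      risk_threshold=0.1):
    proposal = baseline_agent.act(state)
    if risk_critic(state, proposal) > risk_threshold:
        return safe_agent.correct(state, proposal)   # risk-aware correction
    return proposal
```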
- Flexible Attention-Based Multi-Policy Fusion for Efficient Deep Reinforcement Learning [78.31888150539258]
Reinforcement learning (RL) agents have long sought to approach the efficiency of human learning.
Prior studies in RL have incorporated external knowledge policies to help agents improve sample efficiency.
We present Knowledge-Grounded RL (KGRL), an RL paradigm fusing multiple knowledge policies and aiming for human-like efficiency and flexibility.
arXiv Detail & Related papers (2022-10-07T17:56:57Z)
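A small sketch of attention-based policy fusion in the spirit of the KGRL entry above: each knowledge policy's action distribution is mixed with softmax attention weights. The policy and scoring interfaces are assumptions, not the paper's implementation.

```python
# Sketch of attention-based multi-policy fusion: score each policy for the
# current state and mix their action distributions with softmax weights.
import numpy as np

def fuse_policies(state, policies, score_fn):
    """policies: list of callables state -> action-probability vector;
    score_fn: callable (state, policy_index) -> relevance score."""
    scores = np.array([score_fn(state, i) for i in range(len(policies))])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()                  # softmax attention weights
    dists = np.stack([p(state) for p in policies])
    return weights @ dists                    # fused action distribution
```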
- Safe reinforcement learning for multi-energy management systems with known constraint functions [0.0]
Reinforcement learning (RL) is a promising optimal control technique for multi-energy management systems.
We present two novel safe RL methods, namely SafeFallback and GiveSafe.
In a simulated multi-energy systems case study we have shown that both methods start with a significantly higher utility.
arXiv Detail & Related papers (2022-07-08T11:33:53Z)
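One way a known constraint function can be used, sketched loosely (the actual SafeFallback and GiveSafe mechanisms are not reproduced here): discard candidate actions that violate the constraint and fall back to a predefined safe action when none remain. All names are illustrative.

```python
# Sketch of action filtering with a known constraint function: only actions
# with constraint_fn(state, a) <= 0 are admissible; otherwise a conservative
# fallback controller is used.
def constrained_action(state, candidates, constraint_fn, fallback_action):
    feasible = [a for a in candidates if constraint_fn(state, a) <= 0.0]
    if feasible:
        return feasible[0]          # e.g. the highest-ranked feasible action
    return fallback_action(state)   # safe fallback when no candidate passes
```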
- Provably Safe Reinforcement Learning: Conceptual Analysis, Survey, and Benchmarking [12.719948223824483]
Ensuring the safety of reinforcement learning (RL) algorithms is crucial to unlock their potential for many real-world tasks.
However, vanilla RL and most safe RL approaches do not guarantee safety.
We introduce a categorization of existing provably safe RL methods, present the conceptual foundations for both continuous and discrete action spaces, and empirically benchmark existing methods.
We provide practical guidance on selecting provably safe RL approaches depending on the safety specification, RL algorithm, and type of action space.
arXiv Detail & Related papers (2022-05-13T16:34:36Z)
- Lyapunov-based uncertainty-aware safe reinforcement learning [0.0]
Reinforcement learning (RL) has shown promising performance in learning optimal policies for a variety of sequential decision-making tasks.
In many real-world RL problems, besides optimizing the main objectives, the agent is expected to satisfy a certain level of safety.
We propose a Lyapunov-based uncertainty-aware safe RL model to address these limitations.
arXiv Detail & Related papers (2021-07-29T13:08:15Z)
- Combining Pessimism with Optimism for Robust and Efficient Model-Based Deep Reinforcement Learning [56.17667147101263]
In real-world tasks, reinforcement learning agents encounter situations that are not present during training time.
To ensure reliable performance, the RL agents need to exhibit robustness against worst-case situations.
We propose the Robust Hallucinated Upper-Confidence RL (RH-UCRL) algorithm to provably solve this problem.
arXiv Detail & Related papers (2021-03-18T16:50:17Z)
- Safe Distributional Reinforcement Learning [19.607668635077495]
Safety in reinforcement learning (RL) is a key property in both training and execution in many domains such as autonomous driving or finance.
We formalize it with a constrained RL formulation in the distributional RL setting.
We empirically validate our propositions on artificial and real domains against appropriate state-of-the-art safe RL algorithms.
arXiv Detail & Related papers (2021-02-26T13:03:27Z)
- Cautious Reinforcement Learning with Logical Constraints [78.96597639789279]
An adaptive safe padding forces Reinforcement Learning (RL) to synthesise optimal control policies while ensuring safety during the learning process.
Theoretical guarantees are available on the optimality of the synthesised policies and on the convergence of the learning algorithm.
arXiv Detail & Related papers (2020-02-26T00:01:08Z)