Explaining Online Reinforcement Learning Decisions of Self-Adaptive
Systems
- URL: http://arxiv.org/abs/2210.05931v1
- Date: Wed, 12 Oct 2022 05:38:27 GMT
- Title: Explaining Online Reinforcement Learning Decisions of Self-Adaptive
Systems
- Authors: Felix Feit and Andreas Metzger and Klaus Pohl
- Abstract summary: Design time uncertainty poses an important challenge when developing a self-adaptive system.
Online reinforcement learning is an emerging approach to realizing self-adaptive systems in the presence of design time uncertainty.
Deep RL represents learned knowledge as a neural network whereby it can generalize over unseen inputs.
- Score: 0.90238471756546
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Design time uncertainty poses an important challenge when developing a
self-adaptive system. As an example, defining how the system should adapt when
facing a new environment state requires understanding the precise effect of an
adaptation, which may not be known at design time. Online reinforcement
learning, i.e., employing reinforcement learning (RL) at runtime, is an
emerging approach to realizing self-adaptive systems in the presence of design
time uncertainty. By using Online RL, the self-adaptive system can learn from
actual operational data and leverage feedback only available at runtime.
Recently, Deep RL has been gaining interest. Deep RL represents learned knowledge as
a neural network whereby it can generalize over unseen inputs, as well as
handle continuous environment states and adaptation actions. A fundamental
problem of Deep RL is that learned knowledge is not explicitly represented. For
a human, it is practically impossible to relate the parametrization of the
neural network to concrete RL decisions and thus Deep RL essentially appears as
a black box. Yet, understanding the decisions made by Deep RL is key to (1)
increasing trust, and (2) facilitating debugging. Such debugging is especially
relevant for self-adaptive systems, because the reward function, which
quantifies the feedback to the RL algorithm, must be explicitly defined by
developers, thus introducing a potential for human error. To explain Deep RL
for self-adaptive systems, we
enhance and combine two existing explainable RL techniques from the machine
learning literature. The combined technique, XRL-DINE, overcomes the respective
limitations of the individual techniques. We present a proof-of-concept
implementation of XRL-DINE, as well as qualitative and quantitative results of
applying XRL-DINE to a self-adaptive system exemplar.
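To make the runtime learning loop and the role of the developer-defined reward function concrete, here is a minimal sketch in Python (our illustration, not the paper's implementation and unrelated to XRL-DINE): an online deep RL loop for a hypothetical self-adaptive system that scales service instances. The state layout, action set, SLA threshold, and reward weights are assumptions made purely for illustration.

```python
# Minimal sketch (illustrative only, not the paper's system): an online deep RL
# loop for a hypothetical self-adaptive service that adds or removes instances.
import random
import torch
import torch.nn as nn

# Hypothetical state: [request_rate, response_time, num_instances]
STATE_DIM, N_ACTIONS = 3, 3          # actions: remove / keep / add one instance

q_net = nn.Sequential(nn.Linear(STATE_DIM, 32), nn.ReLU(), nn.Linear(32, N_ACTIONS))
optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-3)
GAMMA, EPSILON = 0.9, 0.1

def reward(state):
    """Developer-defined reward: penalize SLA violations and resource cost.
    The 0.5 s SLA threshold and the weights are illustrative assumptions."""
    request_rate, response_time, num_instances = state
    sla_penalty = 10.0 if response_time > 0.5 else 0.0
    cost_penalty = 0.5 * num_instances
    return -(sla_penalty + cost_penalty)

def select_action(state):
    """Epsilon-greedy choice over the (black-box) neural Q-function."""
    if random.random() < EPSILON:
        return random.randrange(N_ACTIONS)
    with torch.no_grad():
        return int(q_net(torch.tensor(state, dtype=torch.float32)).argmax())

def online_update(state, action, r, next_state):
    """Single Q-learning step driven purely by operational (runtime) data."""
    q = q_net(torch.tensor(state, dtype=torch.float32))[action]
    with torch.no_grad():
        target = r + GAMMA * q_net(torch.tensor(next_state, dtype=torch.float32)).max()
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# One runtime step of the adaptation loop (observations are hypothetical)
state = [120.0, 0.62, 4]             # request rate, response time, instances
action = select_action(state)
next_state = [120.0, 0.48, 4 + (action - 1)]
online_update(state, action, reward(next_state), next_state)
```

The reward function is the one place where developer intent enters the loop; a mistaken weight or threshold there silently mis-guides learning, which is why explanations of the resulting adaptation decisions are valuable for debugging.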
Related papers
- Is Value Learning Really the Main Bottleneck in Offline RL? [70.54708989409409]
We show that the choice of a policy extraction algorithm significantly affects the performance and scalability of offline RL.
We propose two simple test-time policy improvement methods and show that these methods lead to better performance.
arXiv Detail & Related papers (2024-06-13T17:07:49Z)
- A User Study on Explainable Online Reinforcement Learning for Adaptive Systems [0.802904964931021]
Online reinforcement learning (RL) is increasingly used for realizing adaptive systems in the presence of design time uncertainty.
With Deep RL gaining interest, the learned knowledge is no longer explicitly represented, but is instead encoded in a neural network.
XRL-DINE provides visual insights into why certain decisions were made at important time points.
arXiv Detail & Related papers (2023-07-09T05:12:42Z)
- A Survey on Explainable Reinforcement Learning: Concepts, Algorithms, Challenges [38.70863329476517]
Reinforcement Learning (RL) is a popular machine learning paradigm where intelligent agents interact with the environment to fulfill a long-term goal.
Despite the encouraging results achieved, the deep neural network-based backbone is widely deemed a black box that prevents practitioners from trusting and employing trained agents in realistic scenarios where high security and reliability are essential.
To alleviate this issue, a large volume of literature has been devoted to shedding light on the inner workings of intelligent agents, either by constructing intrinsic interpretability or by providing post-hoc explainability.
arXiv Detail & Related papers (2022-11-12T13:52:06Z)
- Entropy Regularized Reinforcement Learning with Cascading Networks [9.973226671536041]
Deep RL uses neural networks as function approximators.
One of the major difficulties of RL is the absence of i.i.d. data.
In this work, we challenge the common practice, adopted from the (un)supervised learning community, of using a fixed neural architecture.
arXiv Detail & Related papers (2022-10-16T10:28:59Z)
- Automated Reinforcement Learning (AutoRL): A Survey and Open Problems [92.73407630874841]
Automated Reinforcement Learning (AutoRL) involves not only standard applications of AutoML but also includes additional challenges unique to RL.
We provide a common taxonomy, discuss each area in detail and pose open problems which would be of interest to researchers going forward.
arXiv Detail & Related papers (2022-01-11T12:41:43Z)
- RvS: What is Essential for Offline RL via Supervised Learning? [77.91045677562802]
Recent work has shown that supervised learning alone, without temporal difference (TD) learning, can be remarkably effective for offline RL.
In every environment suite we consider, simply maximizing likelihood with a two-layer feedforward network is competitive.
These results also probe the limits of existing RvS methods, which are comparatively weak on random data. (A minimal sketch of this supervised recipe is given after the related-papers list below.)
arXiv Detail & Related papers (2021-12-20T18:55:16Z)
- A Workflow for Offline Model-Free Robotic Reinforcement Learning [117.07743713715291]
Offline reinforcement learning (RL) enables learning control policies by utilizing only prior experience, without any online interaction.
We develop a practical workflow for using offline RL, analogous to the relatively well-understood workflows for supervised learning problems.
We demonstrate the efficacy of this workflow in producing effective policies without any online tuning.
arXiv Detail & Related papers (2021-09-22T16:03:29Z)
- Heuristic-Guided Reinforcement Learning [31.056460162389783]
Tabula rasa RL algorithms require environment interactions or computation that scales with the horizon of the decision-making task.
Our framework can be viewed as a horizon-based regularization for controlling bias and variance in RL under a finite interaction budget.
In particular, we introduce the novel concept of an "improvable heuristic" -- a heuristic that allows an RL agent to extrapolate beyond its prior knowledge.
arXiv Detail & Related papers (2021-06-05T00:04:09Z)
- Explainability in Deep Reinforcement Learning [68.8204255655161]
We review recent works that aim to attain Explainable Reinforcement Learning (XRL).
In critical situations where it is essential to justify and explain the agent's behaviour, better explainability and interpretability of RL models could help gain scientific insight on the inner workings of what is still considered a black box.
arXiv Detail & Related papers (2020-08-15T10:11:42Z)
- Dynamics Generalization via Information Bottleneck in Deep Reinforcement Learning [90.93035276307239]
We propose an information theoretic regularization objective and an annealing-based optimization method to achieve better generalization ability in RL agents.
We demonstrate the extreme generalization benefits of our approach in different domains ranging from maze navigation to robotic tasks.
This work provides a principled way to improve generalization in RL by gradually removing information that is redundant for task-solving.
arXiv Detail & Related papers (2020-08-03T02:24:20Z)
- Evolving Inborn Knowledge For Fast Adaptation in Dynamic POMDP Problems [5.23587935428994]
In this paper, we exploit the highly adaptive nature of neuromodulated neural networks to evolve a controller that uses the latent space of an autoencoder in a POMDP.
The integration of inborn knowledge and online plasticity enabled fast adaptation and better performance in comparison to some non-evolutionary meta-reinforcement learning algorithms.
arXiv Detail & Related papers (2020-04-27T14:55:08Z)
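As referenced in the RvS entry above, here is a minimal sketch (our illustration, with assumed dimensions and a toy batch) of the "RL via supervised learning" recipe: condition a small feedforward policy on the desired outcome, such as a return-to-go, and simply maximize the likelihood of the logged actions; no temporal-difference learning is involved.

```python
# Minimal, illustrative RvS-style update: outcome-conditioned behavior cloning
# with a small feedforward policy. Dimensions and batch contents are assumed.
import torch
import torch.nn as nn

STATE_DIM, OUTCOME_DIM, N_ACTIONS = 4, 1, 2   # assumed sizes; outcome = return-to-go

policy = nn.Sequential(                        # small feedforward policy network
    nn.Linear(STATE_DIM + OUTCOME_DIM, 64), nn.ReLU(),
    nn.Linear(64, N_ACTIONS),
)
optimizer = torch.optim.Adam(policy.parameters(), lr=3e-4)

def rvs_update(states, outcomes, actions):
    """One supervised step: maximize log-likelihood of the dataset actions
    conditioned on state and desired outcome (no TD learning)."""
    logits = policy(torch.cat([states, outcomes], dim=-1))
    loss = nn.functional.cross_entropy(logits, actions)   # negative log-likelihood
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Usage with a toy batch of offline transitions
batch_states   = torch.randn(32, STATE_DIM)
batch_outcomes = torch.rand(32, OUTCOME_DIM)       # e.g., normalized return-to-go
batch_actions  = torch.randint(0, N_ACTIONS, (32,))
rvs_update(batch_states, batch_outcomes, batch_actions)
```

Conditioning on the outcome is the only thing separating this sketch from plain behavior cloning.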