A User Study on Explainable Online Reinforcement Learning for Adaptive
Systems
- URL: http://arxiv.org/abs/2307.04098v1
- Date: Sun, 9 Jul 2023 05:12:42 GMT
- Title: A User Study on Explainable Online Reinforcement Learning for Adaptive
Systems
- Authors: Andreas Metzger and Jan Laufer and Felix Feit and Klaus Pohl
- Abstract summary: Online reinforcement learning (RL) is increasingly used for realizing adaptive systems in the presence of design time uncertainty.
With Deep RL gaining interest, the learned knowledge is no longer explicitly represented but is instead encoded in a neural network.
XRL-DINE provides visual insights into why certain decisions were made at important time points.
- Score: 0.802904964931021
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Online reinforcement learning (RL) is increasingly used for realizing
adaptive systems in the presence of design time uncertainty. Online RL
facilitates learning from actual operational data and thereby leverages
feedback only available at runtime. However, Online RL requires the definition
of an effective and correct reward function, which quantifies the feedback to
the RL algorithm and thereby guides learning. With Deep RL gaining interest,
the learned knowledge is no longer explicitly represented but is instead
encoded in a neural network. For a human, it becomes practically impossible to relate
the parametrization of the neural network to concrete RL decisions. Deep RL
thus essentially appears as a black box, which severely limits the debugging of
adaptive systems. We previously introduced the explainable RL technique
XRL-DINE, which provides visual insights into why certain decisions were made
at important time points. Here, we present an empirical user study involving
54 software engineers from academia and industry to assess (1) the performance
of software engineers when performing different tasks using XRL-DINE and (2)
the perceived usefulness and ease of use of XRL-DINE.
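To make the reward-design and black-box points of the abstract concrete, below is a minimal, illustrative sketch of online RL for a self-adaptive system. It is not the paper's XRL-DINE tooling; the adaptation actions, the reward weights, and the `env.execute` runtime interface are assumptions invented for this example.

```python
import random
from collections import defaultdict

# Hypothetical adaptation actions of a self-adaptive system (illustrative only).
ACTIONS = ["add_server", "remove_server", "no_op"]

def reward(response_time_ms: float, num_servers: int) -> float:
    """Quantifies runtime feedback: penalize slow responses and resource cost.
    Choosing such weights correctly is the reward-design challenge the paper highlights."""
    return -0.01 * response_time_ms - 0.5 * num_servers

Q = defaultdict(float)                 # explicit, inspectable knowledge: (state, action) -> value
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # learning rate, discount, exploration rate

def select_action(state):
    """Epsilon-greedy: mostly exploit learned knowledge, occasionally explore."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def online_step(env, state):
    """One online learning step driven purely by feedback observed at runtime."""
    action = select_action(state)
    next_state, response_time_ms, num_servers = env.execute(action)  # hypothetical runtime API
    r = reward(response_time_ms, num_servers)
    td_target = r + GAMMA * max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (td_target - Q[(state, action)])
    return next_state

# With deep RL, the table Q is replaced by a neural network; individual decisions can no
# longer be read off the learned parameters, which is the explainability gap XRL-DINE targets.
```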
Related papers
- Unsupervised-to-Online Reinforcement Learning [59.910638327123394]
Unsupervised-to-online RL (U2O RL) replaces domain-specific supervised offline RL with unsupervised offline RL.
U2O RL not only enables reusing a single pre-trained model for multiple downstream tasks, but also learns better representations.
We empirically demonstrate that U2O RL achieves strong performance that matches or even outperforms previous offline-to-online RL approaches.
arXiv Detail & Related papers (2024-08-27T05:23:45Z)
- Is Value Learning Really the Main Bottleneck in Offline RL? [70.54708989409409]
We show that the choice of a policy extraction algorithm significantly affects the performance and scalability of offline RL.
We propose two simple test-time policy improvement methods and show that these methods lead to better performance.
arXiv Detail & Related papers (2024-06-13T17:07:49Z)
- Abstracted Trajectory Visualization for Explainability in Reinforcement Learning [2.1028463367241033]
Explainable AI (XAI) has demonstrated the potential to help reinforcement learning (RL) practitioners to understand how RL models work.
XAI for users who do not have RL expertise (non-RL experts) has not been studied sufficiently.
We argue that abstracted trajectories, which depict transitions between the major states of the RL model, will be useful for non-RL experts to build a mental model of the agents.
arXiv Detail & Related papers (2024-02-05T21:17:44Z)
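As a generic illustration of the abstracted-trajectory idea in the entry above, the sketch below aggregates concrete trajectories into counts of transitions between coarse "major" states. It is not the paper's algorithm; the `to_major_state` abstraction function and the toy data are assumptions.

```python
from collections import Counter
from typing import Callable, Hashable, Iterable, List

def abstract_transition_counts(
    trajectories: Iterable[List[Hashable]],
    to_major_state: Callable[[Hashable], Hashable],
) -> Counter:
    """Aggregate raw trajectories into counts of transitions between 'major' (abstract) states.

    `to_major_state` maps a concrete environment state to a coarse, human-readable state
    (e.g. via clustering or hand-written rules); it is an assumption of this sketch."""
    counts: Counter = Counter()
    for trajectory in trajectories:
        majors = [to_major_state(s) for s in trajectory]
        for src, dst in zip(majors, majors[1:]):
            if src != dst:                      # keep only transitions between different major states
                counts[(src, dst)] += 1
    return counts

# Toy usage: bucket numeric states into "low" / "high" and print the abstracted graph edges.
if __name__ == "__main__":
    trajs = [[1, 2, 7, 8, 3], [6, 7, 2, 1]]
    graph = abstract_transition_counts(trajs, lambda s: "high" if s >= 5 else "low")
    for (src, dst), n in graph.items():
        print(f"{src} -> {dst}: {n}")
```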
- A Survey on Explainable Reinforcement Learning: Concepts, Algorithms, Challenges [38.70863329476517]
Reinforcement Learning (RL) is a popular machine learning paradigm where intelligent agents interact with the environment to fulfill a long-term goal.
Despite the encouraging results achieved, the deep neural network-based backbone is widely deemed a black box that prevents practitioners from trusting and employing trained agents in realistic scenarios where high security and reliability are essential.
To alleviate this issue, a large volume of literature has been devoted to shedding light on the inner workings of intelligent agents, by constructing intrinsic interpretability or providing post-hoc explainability.
arXiv Detail & Related papers (2022-11-12T13:52:06Z)
- Explaining Online Reinforcement Learning Decisions of Self-Adaptive Systems [0.90238471756546]
Design time uncertainty poses an important challenge when developing a self-adaptive system.
Online reinforcement learning is an emerging approach to realizing self-adaptive systems in the presence of design time uncertainty.
Deep RL represents learned knowledge as a neural network whereby it can generalize over unseen inputs.
arXiv Detail & Related papers (2022-10-12T05:38:27Z)
- Contrastive Learning as Goal-Conditioned Reinforcement Learning [147.28638631734486]
In reinforcement learning (RL), it is easier to solve a task if given a good representation.
While deep RL should automatically acquire such good representations, prior work often finds that learning representations in an end-to-end fashion is unstable.
We show (contrastive) representation learning methods can be cast as RL algorithms in their own right.
arXiv Detail & Related papers (2022-06-15T14:34:15Z)
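The claim above that contrastive representation learning can act as a goal-conditioned RL algorithm roughly amounts to training a critic with an InfoNCE-style objective over (state, action) and future-state pairs. The PyTorch sketch below illustrates that idea under assumed network sizes and batch conventions; it is not the authors' released implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContrastiveGoalCritic(nn.Module):
    """Critic trained contrastively: embeddings of (state, action) should score highly against
    embeddings of states actually reached later in the same trajectory (positives) and low
    against goal states drawn from other trajectories (negatives)."""

    def __init__(self, state_dim: int, action_dim: int, embed_dim: int = 64):
        super().__init__()
        self.sa_encoder = nn.Sequential(nn.Linear(state_dim + action_dim, 128), nn.ReLU(),
                                        nn.Linear(128, embed_dim))
        self.goal_encoder = nn.Sequential(nn.Linear(state_dim, 128), nn.ReLU(),
                                          nn.Linear(128, embed_dim))

    def forward(self, states, actions, goals):
        phi = self.sa_encoder(torch.cat([states, actions], dim=-1))   # (batch, embed_dim)
        psi = self.goal_encoder(goals)                                # (batch, embed_dim)
        return phi @ psi.T                                            # (batch, batch) similarity logits

def info_nce_loss(critic, states, actions, future_states):
    """Diagonal entries are the positive (s, a, future-state-from-same-trajectory) pairs;
    off-diagonal entries serve as in-batch negatives."""
    logits = critic(states, actions, future_states)
    labels = torch.arange(logits.shape[0])
    return F.cross_entropy(logits, labels)
```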
- RvS: What is Essential for Offline RL via Supervised Learning? [77.91045677562802]
Recent work has shown that supervised learning alone, without temporal difference (TD) learning, can be remarkably effective for offline RL.
In every environment suite we consider, simply maximizing likelihood with a two-layer feedforward MLP is competitive.
These results also probe the limits of existing RvS methods, which are comparatively weak on random data.
arXiv Detail & Related papers (2021-12-20T18:55:16Z)
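A hedged sketch of the "offline RL via supervised learning" (RvS) recipe summarized above: condition a small feedforward policy on an outcome variable (a goal or a reward-to-go value) and fit it to dataset actions by maximum likelihood. The layer sizes and the Gaussian/MSE simplification are assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn

class RvSPolicy(nn.Module):
    """Small feedforward policy (here two hidden layers) conditioned on the state plus an
    outcome variable, trained by plain maximum likelihood on dataset actions."""

    def __init__(self, state_dim: int, cond_dim: int, action_dim: int, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + cond_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim),
        )

    def forward(self, states, conditioning):
        return self.net(torch.cat([states, conditioning], dim=-1))

def rvs_training_step(policy, optimizer, states, conditioning, actions):
    """Supervised 'maximize likelihood' step: for a fixed-variance Gaussian policy this
    reduces to mean-squared error against the dataset actions."""
    pred = policy(states, conditioning)
    loss = ((pred - actions) ** 2).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```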
- A Workflow for Offline Model-Free Robotic Reinforcement Learning [117.07743713715291]
Offline reinforcement learning (RL) enables learning control policies by utilizing only prior experience, without any online interaction.
We develop a practical workflow for using offline RL analogous to the relatively well-understood workflows for supervised learning problems.
We demonstrate the efficacy of this workflow in producing effective policies without any online tuning.
arXiv Detail & Related papers (2021-09-22T16:03:29Z)
- POAR: Efficient Policy Optimization via Online Abstract State Representation Learning [6.171331561029968]
State Representation Learning (SRL) is proposed to specifically learn to encode task-relevant features from complex sensory data into low-dimensional states.
We introduce a new SRL prior called domain resemblance to leverage expert demonstration to improve SRL interpretations.
We empirically verify POAR to efficiently handle tasks in high dimensions and facilitate training real-life robots directly from scratch.
arXiv Detail & Related papers (2021-09-17T16:52:03Z)
- Heuristic-Guided Reinforcement Learning [31.056460162389783]
Tabula rasa RL algorithms require environment interactions or computation that scales with the horizon of the decision-making task.
Our framework can be viewed as a horizon-based regularization for controlling bias and variance in RL under a finite interaction budget.
In particular, we introduce the novel concept of an "improvable heuristic" -- a heuristic that allows an RL agent to extrapolate beyond its prior knowledge.
arXiv Detail & Related papers (2021-06-05T00:04:09Z)
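One common way to realize the horizon-based regularization summarized above is to blend a heuristic value estimate into the reward while shrinking the discount, so the agent effectively solves a shorter-horizon problem. The blending formula below is an assumption made for illustration and is not necessarily the paper's exact formulation.

```python
from typing import Tuple

def reshape_with_heuristic(reward: float, next_state_heuristic: float,
                           gamma: float, lam: float) -> Tuple[float, float]:
    """Blend a heuristic value estimate h(s') into the reward and shrink the discount.

    The agent then optimizes a shorter-horizon objective (effective discount lam * gamma)
    whose reward is boosted by the heuristic; a good heuristic reduces the interaction
    needed, while a poor one introduces bias."""
    reshaped_reward = reward + (1.0 - lam) * gamma * next_state_heuristic
    effective_discount = lam * gamma
    return reshaped_reward, effective_discount

# Example: with lam = 0 the agent greedily trusts the heuristic (one-step lookahead);
# with lam = 1 the reshaping vanishes and the original long-horizon problem is recovered.
print(reshape_with_heuristic(reward=1.0, next_state_heuristic=5.0, gamma=0.99, lam=0.5))
```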
- AWAC: Accelerating Online Reinforcement Learning with Offline Datasets [84.94748183816547]
We show that our method, advantage weighted actor critic (AWAC), enables rapid learning of skills with a combination of prior demonstration data and online experience.
Our results show that incorporating prior data can reduce the time required to learn a range of robotic skills to practical time-scales.
arXiv Detail & Related papers (2020-06-16T17:54:41Z)
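The AWAC update summarized above can be sketched as an advantage-weighted supervised policy update: dataset and replay actions are weighted by exponentiated advantage estimates. The `policy` and `q_network` interfaces, the single-sample value baseline, and the weight clipping below are assumptions for illustration, not the authors' exact implementation.

```python
import torch

def awac_policy_loss(policy, q_network, states, actions, lam: float = 1.0):
    """Advantage-weighted actor update: regress the policy toward logged actions,
    weighting each action by exp(advantage / lam).

    Assumes `policy(states)` returns a torch.distributions.Distribution over actions and
    `q_network(states, actions)` returns Q-value estimates of shape (batch,); both
    interfaces are assumptions of this sketch."""
    with torch.no_grad():
        q_data = q_network(states, actions)                     # Q(s, a) for logged actions
        sampled_actions = policy(states).sample()
        v_estimate = q_network(states, sampled_actions)         # crude V(s) ~ Q(s, a ~ pi)
        advantage = q_data - v_estimate
        weights = torch.exp(advantage / lam).clamp(max=100.0)   # clip for numerical stability
    log_prob = policy(states).log_prob(actions)                 # gradients flow only through log_prob
    return -(weights * log_prob).mean()
```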
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.