Explaining Agent's Decision-making in a Hierarchical Reinforcement Learning Scenario
- URL: http://arxiv.org/abs/2212.06967v1
- Date: Wed, 14 Dec 2022 01:18:45 GMT
- Title: Explaining Agent's Decision-making in a Hierarchical Reinforcement Learning Scenario
- Authors: Hugo Muñoz, Ernesto Portugal, Angel Ayala, Bruno Fernandes, Francisco Cruz
- Abstract summary: Reinforcement learning is a machine learning approach based on behavioral psychology.
In this work, we make use of the memory-based explainable reinforcement learning method in a hierarchical environment composed of sub-tasks.
- Score: 0.6643086804649938
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning is a machine learning approach based on behavioral psychology. It focuses on learning agents that can acquire knowledge and learn to carry out new tasks by interacting with the environment. However, a problem arises when reinforcement learning is used in critical contexts, where the users of the system need more information about, and more confidence in, the actions executed by an agent. In this regard, explainable reinforcement learning seeks to provide an agent in training with methods to explain its behavior in such a way that users with no machine learning experience can understand it. One such method is memory-based explainable reinforcement learning, which uses an episodic memory to compute a probability of success for each state-action pair. In this work, we propose to use the memory-based explainable reinforcement learning method in a hierarchical environment composed of sub-tasks that must first be addressed to solve a more complex task. The end goal is to verify whether it is possible to give the agent the ability to explain its actions in the global task as well as in the sub-tasks. The results obtained show that it is possible to use the memory-based method in hierarchical environments with high-level tasks, computing probabilities of success that serve as a basis for explaining the agent's behavior.
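The memory-based method lends itself to a short sketch. The hypothetical Python snippet below illustrates the general idea as described in the abstract: an episodic memory records which state-action pairs each episode visited within each sub-task, and the probability of success of a pair is estimated as the fraction of episodes containing it that ended successfully. All names here (`EpisodicMemory`, `record`, `probability_of_success`) are illustrative assumptions, not the paper's actual implementation.

```python
from collections import defaultdict

class EpisodicMemory:
    """Minimal sketch of memory-based explainability: P(success | s, a) is
    estimated as the fraction of episodes that visited (s, a) and succeeded."""

    def __init__(self):
        self.visits = defaultdict(int)     # (sub_task, state, action) -> episodes containing the pair
        self.successes = defaultdict(int)  # (sub_task, state, action) -> successful such episodes

    def record(self, sub_task, transitions, succeeded):
        """Store one finished episode of a sub-task.
        transitions: iterable of (state, action) pairs visited during the episode."""
        for state, action in set(transitions):  # count each pair once per episode
            key = (sub_task, state, action)
            self.visits[key] += 1
            if succeeded:
                self.successes[key] += 1

    def probability_of_success(self, sub_task, state, action):
        """Estimated P(success | state, action) within a sub-task (None if unseen)."""
        key = (sub_task, state, action)
        if self.visits[key] == 0:
            return None
        return self.successes[key] / self.visits[key]

memory = EpisodicMemory()
memory.record("reach-object", [("s0", "up"), ("s1", "right")], succeeded=True)
memory.record("reach-object", [("s0", "up"), ("s1", "left")], succeeded=False)
print(memory.probability_of_success("reach-object", "s0", "up"))  # 0.5
```

An explanation for the global task can then aggregate these per-sub-task estimates, e.g. "moving up in s0 led to completing the reach-object sub-task 50% of the time."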
Related papers
- REVEAL-IT: REinforcement learning with Visibility of Evolving Agent poLicy for InTerpretability [23.81322529587759]
REVEAL-IT is a novel framework for explaining the learning process of an agent in complex environments.
We visualize the policy structure and the agent's learning process for various training tasks.
A GNN-based explainer learns to highlight the most important section of the policy, providing a clearer and more robust explanation of the agent's learning process.
arXiv Detail & Related papers (2024-06-20T11:29:26Z)
- RLIF: Interactive Imitation Learning as Reinforcement Learning [56.997263135104504]
We show how off-policy reinforcement learning can enable improved performance under assumptions that are similar to, but potentially even more practical than, those of interactive imitation learning.
Our proposed method uses reinforcement learning with user intervention signals themselves as rewards (a rough sketch of this idea appears after this list).
This relaxes the assumption that intervening experts in interactive imitation learning should be near-optimal and enables the algorithm to learn behaviors that improve over a potentially suboptimal human expert.
arXiv Detail & Related papers (2023-11-21T21:05:21Z)
- Incremental procedural and sensorimotor learning in cognitive humanoid robots [52.77024349608834]
This work presents a cognitive agent that can learn procedures incrementally.
We show the cognitive functions required in each substage and how adding new functions helps address tasks previously unsolved by the agent.
Results show that this approach is capable of solving complex tasks incrementally.
arXiv Detail & Related papers (2023-04-30T22:51:31Z)
- Learning When and What to Ask: a Hierarchical Reinforcement Learning Framework [17.017688226277834]
We formulate a hierarchical reinforcement learning framework for learning to decide when to request additional information from humans.
Results on a simulated human-assisted navigation problem demonstrate the effectiveness of our framework.
arXiv Detail & Related papers (2021-10-14T01:30:36Z)
- Learning an Explicit Hyperparameter Prediction Function Conditioned on Tasks [62.63852372239708]
Meta learning aims to learn the learning methodology for machine learning from observed tasks, so as to generalize to new query tasks.
We interpret this learning methodology as learning an explicit hyperparameter prediction function shared by all training tasks.
Such a setting guarantees that the meta-learned learning methodology is able to flexibly fit diverse query tasks.
arXiv Detail & Related papers (2021-07-06T04:05:08Z)
- PEBBLE: Feedback-Efficient Interactive Reinforcement Learning via Relabeling Experience and Unsupervised Pre-training [94.87393610927812]
We present an off-policy, interactive reinforcement learning algorithm that capitalizes on the strengths of both feedback and off-policy learning.
We demonstrate that our approach is capable of learning tasks of higher complexity than previously considered by human-in-the-loop methods.
arXiv Detail & Related papers (2021-06-09T14:10:50Z)
- Domain-Robust Visual Imitation Learning with Mutual Information Constraints [0.0]
We introduce a new algorithm called Disentangling Generative Adversarial Imitation Learning (DisentanGAIL).
Our algorithm enables autonomous agents to learn directly from high dimensional observations of an expert performing a task.
arXiv Detail & Related papers (2021-03-08T21:18:58Z)
- Coverage as a Principle for Discovering Transferable Behavior in Reinforcement Learning [16.12658895065585]
We argue that representation alone is not enough for efficient transfer in challenging domains and explore how to transfer knowledge through behavior.
The behavior of pre-trained policies may be used for solving the task at hand (exploitation) or for collecting useful data to solve the problem (exploration).
arXiv Detail & Related papers (2021-02-24T16:51:02Z)
- Parrot: Data-Driven Behavioral Priors for Reinforcement Learning [79.32403825036792]
We propose a method for pre-training behavioral priors that can capture complex input-output relationships observed in successful trials.
We show how this learned prior can be used for rapidly learning new tasks without impeding the RL agent's ability to try out novel behaviors.
arXiv Detail & Related papers (2020-11-19T18:47:40Z)
- Explainable robotic systems: Understanding goal-driven actions in a reinforcement learning scenario [1.671353192305391]
In reinforcement learning scenarios, great effort has been focused on providing explanations using data-driven approaches.
In this work, we focus rather on the decision-making process of reinforcement learning agents performing a task in a robotic scenario.
We use the probability of success computed by three different proposed approaches: memory-based, learning-based, and introspection-based.
arXiv Detail & Related papers (2020-06-24T10:51:14Z)
- Meta-Reinforcement Learning Robust to Distributional Shift via Model Identification and Experience Relabeling [126.69933134648541]
We present a meta-reinforcement learning algorithm that is both efficient and extrapolates well when faced with out-of-distribution tasks at test time.
Our method is based on a simple insight: we recognize that dynamics models can be adapted efficiently and consistently with off-policy data.
arXiv Detail & Related papers (2020-06-12T13:34:46Z)
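As an aside on the RLIF entry above, the idea of using user intervention signals themselves as rewards can be sketched as a simple environment wrapper. This is only one plausible reading of that entry's one-line description: the wrapper below assigns a negative reward whenever the expert takes over, and zero otherwise; the actual RLIF algorithm is off-policy and differs in detail. `env` is assumed to follow a gym-style step/reset interface, and `expert` is a hypothetical object.

```python
class InterventionRewardWrapper:
    """Sketch: the agent is rewarded only through intervention signals;
    a human takeover yields -1, otherwise the reward is 0, so the agent
    learns to avoid states where the expert feels compelled to intervene."""

    def __init__(self, env, expert):
        self.env = env        # gym-style environment: reset() -> obs, step(a) -> (obs, r, done, info)
        self.expert = expert  # hypothetical: wants_to_intervene(obs) -> bool, act(obs) -> action
        self.last_obs = None

    def reset(self):
        self.last_obs = self.env.reset()
        return self.last_obs

    def step(self, agent_action):
        if self.expert.wants_to_intervene(self.last_obs):
            action, intervened = self.expert.act(self.last_obs), True
        else:
            action, intervened = agent_action, False
        next_obs, _, done, info = self.env.step(action)  # environment reward is discarded
        self.last_obs = next_obs
        reward = -1.0 if intervened else 0.0             # the intervention itself is the signal
        return next_obs, reward, done, info
```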
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.