Explainable Action Advising for Multi-Agent Reinforcement Learning
- URL: http://arxiv.org/abs/2211.07882v3
- Date: Fri, 16 Jun 2023 15:20:36 GMT
- Title: Explainable Action Advising for Multi-Agent Reinforcement Learning
- Authors: Yue Guo, Joseph Campbell, Simon Stepputtis, Ruiyu Li, Dana Hughes, Fei Fang, Katia Sycara
- Abstract summary: Action advising is a knowledge transfer technique for reinforcement learning based on the teacher-student paradigm.
We introduce Explainable Action Advising, in which the teacher provides action advice as well as associated explanations indicating why the action was chosen.
This allows the student to self-reflect on what it has learned, enabling advice generalization and leading to improved sample efficiency and learning performance.
- Score: 32.49380192781649
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Action advising is a knowledge transfer technique for reinforcement learning
based on the teacher-student paradigm. An expert teacher provides advice to a
student during training in order to improve the student's sample efficiency and
policy performance. Such advice is commonly given in the form of state-action
pairs. However, this format makes it difficult for the student to reason about
the advice and apply it to novel states. We introduce Explainable Action Advising, in which the teacher
provides action advice as well as associated explanations indicating why the
action was chosen. This allows the student to self-reflect on what it has
learned, enabling advice generalization and leading to improved sample
efficiency and learning performance - even in environments where the teacher is
sub-optimal. We empirically show that our framework is effective in both
single-agent and multi-agent scenarios, yielding improved policy returns and
convergence rates when compared to state-of-the-art methods.
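The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the explanation format (a predicate over state features), the `Advice` and `Student` classes, the `teacher_advise` function, and the `obstacle_distance` feature are all hypothetical names chosen for illustration. The sketch only shows how an explanation attached to a state-action pair could let a student reuse advice in states it was never advised on.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

State = Dict[str, float]  # hypothetical feature-dictionary state


@dataclass
class Advice:
    action: str
    # Hypothetical explanation format: a predicate describing the states
    # in which the teacher would choose `action`.
    applies_to: Callable[[State], bool]


def teacher_advise(state: State) -> Advice:
    """Toy teacher: brake when an obstacle is close, otherwise accelerate."""
    if state["obstacle_distance"] < 5.0:
        return Advice("brake", lambda s: s["obstacle_distance"] < 5.0)
    return Advice("accelerate", lambda s: s["obstacle_distance"] >= 5.0)


class Student:
    def __init__(self) -> None:
        self.memory: List[Advice] = []

    def receive(self, advice: Advice) -> None:
        # Store the advised action together with its explanation.
        self.memory.append(advice)

    def recall(self, state: State) -> Optional[str]:
        """Reuse stored advice whose explanation covers this state."""
        for advice in self.memory:
            if advice.applies_to(state):
                return advice.action
        return None  # no applicable advice; fall back to the student's own policy


student = Student()
student.receive(teacher_advise({"obstacle_distance": 2.0}))
print(student.recall({"obstacle_distance": 3.5}))   # brake
print(student.recall({"obstacle_distance": 40.0}))  # None
```

With plain state-action advice, the student would only know what to do at distance 2.0. Here the stored explanation covers the unseen state at distance 3.5, while the state at distance 40.0 matches no stored explanation and is left to the student's own policy.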
Related papers
- RILe: Reinforced Imitation Learning [60.63173816209543]
RILe is a novel trainer-student system that learns a dynamic reward function based on the student's performance and alignment with expert demonstrations.
RILe enables better performance in complex settings where traditional methods falter, outperforming existing methods by 2x in complex simulated robot-locomotion tasks.
arXiv Detail & Related papers (2024-06-12T17:56:31Z)
- Opinion-Guided Reinforcement Learning [0.46040036610482665]
We present a method to guide reinforcement learning agents through opinions.
We evaluate it with synthetic (oracle) and human advisors, at different levels of uncertainty.
Our results indicate that opinions, even if uncertain, improve the performance of reinforcement learning agents.
arXiv Detail & Related papers (2024-05-27T15:52:27Z)
- Adaptive Teaching in Heterogeneous Agents: Balancing Surprise in Sparse Reward Scenarios [3.638198517970729]
Learning from Demonstration can be an efficient way to train systems with analogous agents.
However, naively replicating demonstrations that are out of bounds for the Student's capability can limit efficient learning.
We present a Teacher-Student learning framework specifically tailored to address the challenge of heterogeneity between the Teacher and Student agents.
arXiv Detail & Related papers (2024-05-23T05:52:42Z)
- Reinforcement Teaching [43.80089037901853]
We propose Reinforcement Teaching: a framework for meta-learning in which a teaching policy is learned, through reinforcement, to control a student's learning process.
The student's learning process is modelled as a Markov reward process and the teacher, with its action-space, interacts with the induced Markov decision process.
We show that, for many learning processes, the student's learnable parameters form a Markov state. To avoid having the teacher learn directly from parameters, we propose the Embedder that learns a representation of a student's state from its input/output behaviour.
arXiv Detail & Related papers (2022-04-25T18:04:17Z)
- Iterative Teacher-Aware Learning [136.05341445369265]
In human pedagogy, teachers and students can interact adaptively to maximize communication efficiency.
We propose a gradient-optimization-based teacher-aware learner that can incorporate the teacher's cooperative intention into the likelihood function.
arXiv Detail & Related papers (2021-10-01T00:27:47Z)
- Distribution Matching for Machine Teaching [64.39292542263286]
Machine teaching is an inverse problem of machine learning that aims at steering the student learner towards its target hypothesis.
Previous studies on machine teaching focused on balancing teaching risk and cost to find the best teaching examples.
This paper presents a distribution matching-based machine teaching strategy.
arXiv Detail & Related papers (2021-05-06T09:32:57Z)
- Action Advising with Advice Imitation in Deep Reinforcement Learning [0.5185131234265025]
Action advising is a peer-to-peer knowledge exchange technique built on the teacher-student paradigm.
We present an approach that enables the student agent to imitate previously acquired advice and reuse it directly in its exploration policy.
arXiv Detail & Related papers (2021-04-17T04:24:04Z)
- Privacy-Preserving Teacher-Student Deep Reinforcement Learning [23.934121758649052]
We develop a private mechanism that protects the privacy of the teacher's training dataset.
We empirically show that the algorithm improves the student's convergence rate and utility.
arXiv Detail & Related papers (2021-02-18T20:15:09Z)
- Learning Student-Friendly Teacher Networks for Knowledge Distillation [50.11640959363315]
We propose a novel knowledge distillation approach to facilitate the transfer of dark knowledge from a teacher to a student.
Contrary to most of the existing methods that rely on effective training of student models given pretrained teachers, we aim to learn the teacher models that are friendly to students.
arXiv Detail & Related papers (2021-02-12T07:00:17Z)
- Generative Inverse Deep Reinforcement Learning for Online Recommendation [62.09946317831129]
We propose a novel inverse reinforcement learning approach, namely InvRec, for online recommendation.
InvRec automatically extracts the reward function from users' behaviors for online recommendation.
arXiv Detail & Related papers (2020-11-04T12:12:25Z)
- Dual Policy Distillation [58.43610940026261]
Policy distillation, which transfers a teacher policy to a student policy, has achieved great success in challenging tasks of deep reinforcement learning.
In this work, we introduce dual policy distillation (DPD), a student-student framework in which two learners operate on the same environment to explore different perspectives of the environment.
The key challenge in developing this dual learning framework is to identify the beneficial knowledge from the peer learner for contemporary learning-based reinforcement learning algorithms.
arXiv Detail & Related papers (2020-06-07T06:49:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.