Social Interpretable Reinforcement Learning
- URL: http://arxiv.org/abs/2401.15480v1
- Date: Sat, 27 Jan 2024 19:05:21 GMT
- Title: Social Interpretable Reinforcement Learning
- Authors: Leonardo Lucio Custode, Giovanni Iacca
- Abstract summary: Social Interpretable RL (SIRL) is inspired by social learning principles to improve learning efficiency.
Our results on six well-known benchmarks show that SIRL reaches state-of-the-art performance.
- Score: 4.242435932138821
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Reinforcement Learning (RL) bears the promise of being an enabling technology
for many applications. However, since most of the literature in the field is
currently focused on opaque models, the use of RL in high-stakes scenarios,
where interpretability is crucial, is still limited. Recently, some approaches
to interpretable RL, e.g., based on Decision Trees, have been proposed, but one
of the main limitations of these techniques is their training cost. To overcome
this limitation, we propose a new population-based method, called Social
Interpretable RL (SIRL), inspired by social learning principles, to improve
learning efficiency. Our method mimics a social learning process, where each
agent in a group learns to solve a given task based both on its own individual
experience and on the experience acquired together with its peers. Our
approach is divided into two phases. In the \emph{collaborative phase}, all the
agents in the population interact with a shared instance of the environment,
where each agent observes the state and independently proposes an action. Then,
voting is performed to choose the action that will actually be performed in the
environment. In the \emph{individual phase}, each agent refines its individual
performance by interacting with its own instance of the environment. This
mechanism makes the agents experience a larger number of episodes while
simultaneously reducing the computational cost of the process. Our results on
six well-known benchmarks show that SIRL reaches state-of-the-art performance
w.r.t. the alternative interpretable methods from the literature.
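For illustration, here is a minimal Python sketch of the two-phase loop described above, assuming a Gymnasium-style environment API (reset()/step()), hypothetical agents exposing act() and update(), and a simple plurality vote over discrete actions; all names and details are assumptions for exposition, not the authors' implementation:

from collections import Counter

def collaborative_phase(env, agents, n_episodes):
    # All agents share one environment instance; each observes the state,
    # proposes an action, and a plurality vote decides the executed action.
    for _ in range(n_episodes):
        obs, _ = env.reset()
        done = False
        while not done:
            proposals = [agent.act(obs) for agent in agents]
            action = Counter(proposals).most_common(1)[0][0]  # assumes discrete actions
            next_obs, reward, terminated, truncated, _ = env.step(action)
            done = terminated or truncated
            for agent in agents:
                # Every agent learns from the shared transition.
                agent.update(obs, action, reward, next_obs, done)
            obs = next_obs

def individual_phase(make_env, agents, n_episodes):
    # Each agent refines its own policy on a private environment instance.
    for agent in agents:
        env = make_env()
        for _ in range(n_episodes):
            obs, _ = env.reset()
            done = False
            while not done:
                action = agent.act(obs)
                next_obs, reward, terminated, truncated, _ = env.step(action)
                done = terminated or truncated
                agent.update(obs, action, reward, next_obs, done)
                obs = next_obs

Alternating the two phases in this way exposes every agent to the episodes generated collectively during the collaborative phase while only a single shared environment instance is stepped, which is the mechanism the abstract credits for increasing the number of experienced episodes while reducing computational cost.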
Related papers
- Demonstration-free Autonomous Reinforcement Learning via Implicit and
Bidirectional Curriculum [22.32327908453603]
We propose a demonstration-free reinforcement learning algorithm via Implicit and Bi-directional Curriculum (IBC).
With an auxiliary agent that is conditionally activated upon learning progress and a bidirectional goal curriculum based on optimal transport, our method outperforms previous methods.
arXiv Detail & Related papers (2023-05-17T04:31:36Z) - A State-Distribution Matching Approach to Non-Episodic Reinforcement
Learning [61.406020873047794]
A major hurdle to real-world application arises from the development of algorithms in an episodic setting.
We propose a new method, MEDAL, that trains the backward policy to match the state distribution in the provided demonstrations.
Our experiments show that MEDAL matches or outperforms prior methods on three sparse-reward continuous control tasks.
arXiv Detail & Related papers (2022-05-11T00:06:29Z) - Autonomous Reinforcement Learning: Formalism and Benchmarking [106.25788536376007]
Real-world embodied learning, such as that performed by humans and animals, is situated in a continual, non-episodic world.
Common benchmark tasks in RL are episodic, with the environment resetting between trials to provide the agent with multiple attempts.
This discrepancy presents a major challenge when attempting to take RL algorithms developed for episodic simulated environments and run them on real-world platforms.
arXiv Detail & Related papers (2021-12-17T16:28:06Z) - Persistent Reinforcement Learning via Subgoal Curricula [114.83989499740193]
Value-accelerated Persistent Reinforcement Learning (VaPRL) generates a curriculum of initial states.
VaPRL reduces the interventions required by three orders of magnitude compared to episodic reinforcement learning.
arXiv Detail & Related papers (2021-07-27T16:39:45Z) - Conditional Meta-Learning of Linear Representations [57.90025697492041]
Standard meta-learning for representation learning aims to find a common representation to be shared across multiple tasks.
In this work, we overcome the limitations of a single shared representation by inferring a conditioning function, mapping the tasks' side information into a representation tailored to the task at hand.
We propose a meta-algorithm capable of leveraging this advantage in practice.
arXiv Detail & Related papers (2021-03-30T12:02:14Z) - PsiPhi-Learning: Reinforcement Learning with Demonstrations using
Successor Features and Inverse Temporal Difference Learning [102.36450942613091]
We propose an inverse reinforcement learning algorithm, called \emph{inverse temporal difference learning} (ITD).
We show how to seamlessly integrate ITD with learning from online environment interactions, arriving at a novel algorithm for reinforcement learning with demonstrations, called $\Psi\Phi$-learning.
arXiv Detail & Related papers (2021-02-24T21:12:09Z) - Parallel Knowledge Transfer in Multi-Agent Reinforcement Learning [0.2538209532048867]
This paper proposes a novel knowledge transfer framework in MARL, PAT (Parallel Attentional Transfer).
We design two acting modes in PAT, student mode and self-learning mode.
When agents are unfamiliar with the environment, the shared attention mechanism in student mode effectively selects learning knowledge from other agents to decide agents' actions.
arXiv Detail & Related papers (2020-03-29T17:42:00Z) - Human AI interaction loop training: New approach for interactive
reinforcement learning [0.0]
Reinforcement Learning (RL) provides effective results in various decision-making tasks of machine learning, with an agent learning from a stand-alone reward function.
However, RL presents unique challenges with large state and action spaces, as well as in the determination of rewards.
Imitation Learning (IL) offers a promising solution for those challenges using a teacher.
arXiv Detail & Related papers (2020-03-09T15:27:48Z) - On the interaction between supervision and self-play in emergent
communication [82.290338507106]
We investigate the relationship between two categories of learning signals with the ultimate goal of improving sample efficiency.
We find that first training agents via supervised learning on human data followed by self-play outperforms the converse.
arXiv Detail & Related papers (2020-02-04T02:35:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.