Discovering Individual Rewards in Collective Behavior through Inverse
Multi-Agent Reinforcement Learning
- URL: http://arxiv.org/abs/2305.10548v1
- Date: Wed, 17 May 2023 20:07:30 GMT
- Title: Discovering Individual Rewards in Collective Behavior through Inverse
Multi-Agent Reinforcement Learning
- Authors: Daniel Waelchli, Pascal Weber, Petros Koumoutsakos
- Abstract summary: We introduce an off-policy inverse multi-agent reinforcement learning algorithm (IMARL).
By leveraging demonstrations, our algorithm automatically uncovers the reward function and learns an effective policy for the agents.
The proposed IMARL algorithm is a significant step towards understanding collective dynamics from the perspective of its constituents.
- Score: 3.4437947384641032
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The discovery of individual objectives in collective behavior of complex
dynamical systems such as fish schools and bacteria colonies is a long-standing
challenge. Inverse reinforcement learning is a potent approach for addressing
this challenge but its applicability to dynamical systems, involving continuous
state-action spaces and multiple interacting agents, has been limited. In this
study, we tackle this challenge by introducing an off-policy inverse
multi-agent reinforcement learning algorithm (IMARL). Our approach combines the
ReF-ER techniques with guided cost learning. By leveraging demonstrations, our
algorithm automatically uncovers the reward function and learns an effective
policy for the agents. Through extensive experimentation, we demonstrate that
the proposed policy captures the behavior observed in the provided data, and
achieves promising results across problem domains including single agent models
in the OpenAI gym and multi-agent models of schooling behavior. The present
study shows that the proposed IMARL algorithm is a significant step towards
understanding collective dynamics from the perspective of its constituents, and
showcases its value as a tool for studying complex physical systems exhibiting
collective behaviour.
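The abstract names guided cost learning as one ingredient of the method. As a rough illustration only, the importance-sampled maximum-entropy objective usually associated with guided cost learning can be sketched as below. This is not the authors' implementation: the function name `gcl_loss`, its arguments, and the use of per-trajectory summed rewards are assumptions made for the example, and the ReF-ER part of the method is not shown.

```python
import numpy as np

def gcl_loss(demo_rewards, sample_rewards, sample_log_q):
    """Hypothetical sketch of a guided-cost-learning style loss.

    Demonstrations are modelled as p(tau) proportional to exp(R(tau)).
    The log partition function is estimated by importance sampling from
    trajectories drawn from a sampling policy with log-density sample_log_q.
    All inputs are 1-D arrays with one entry per trajectory (assumed).
    """
    demo_rewards = np.asarray(demo_rewards, dtype=float)
    # Log importance weights: R(tau_j) - log q(tau_j)
    log_w = (np.asarray(sample_rewards, dtype=float)
             - np.asarray(sample_log_q, dtype=float))
    # Numerically stable log-mean-exp estimate of log Z
    m = log_w.max()
    log_z = m + np.log(np.mean(np.exp(log_w - m)))
    # Negative log-likelihood of the demonstrations (to be minimized)
    return -demo_rewards.mean() + log_z
```

In a full training loop, the reward values would come from a parametric reward model, and this loss would be minimized with respect to that model's parameters while the policy is updated on the current reward estimate.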
Related papers
- Large Language Model-based Human-Agent Collaboration for Complex Task
Solving [94.3914058341565]
We introduce the problem of Large Language Models (LLMs)-based human-agent collaboration for complex task-solving.
We propose a Reinforcement Learning-based Human-Agent Collaboration method, ReHAC.
This approach includes a policy model designed to determine the most opportune stages for human intervention within the task-solving process.
arXiv Detail & Related papers (2024-02-20T11:03:36Z)
- Joint Intrinsic Motivation for Coordinated Exploration in Multi-Agent
Deep Reinforcement Learning [0.0]
We propose an approach for rewarding strategies where agents collectively exhibit novel behaviors.
JIM rewards joint trajectories based on a centralized measure of novelty designed to function in continuous environments.
Results show that joint exploration is crucial for solving tasks where the optimal strategy requires a high level of coordination.
arXiv Detail & Related papers (2024-02-06T13:02:00Z)
- DCIR: Dynamic Consistency Intrinsic Reward for Multi-Agent Reinforcement
Learning [84.22561239481901]
We propose a new approach that enables agents to learn whether their behaviors should be consistent with that of other agents.
We evaluate DCIR in multiple environments including Multi-agent Particle, Google Research Football and StarCraft II Micromanagement.
arXiv Detail & Related papers (2023-12-10T06:03:57Z)
- Inverse Factorized Q-Learning for Cooperative Multi-agent Imitation
Learning [13.060023718506917]
Imitation learning (IL) is the problem of learning to mimic expert behaviors from demonstrations; this work considers it in cooperative multi-agent systems.
We introduce a novel multi-agent IL algorithm designed to address these challenges.
Our approach enables centralized learning by leveraging mixing networks to aggregate decentralized Q functions.
arXiv Detail & Related papers (2023-10-10T17:11:20Z)
- Safe Multi-agent Learning via Trapping Regions [89.24858306636816]
We apply the concept of trapping regions, known from the qualitative theory of dynamical systems, to create safety sets in the joint strategy space for decentralized learning.
We propose a binary partitioning algorithm for verification that candidate sets form trapping regions in systems with known learning dynamics, and a sampling algorithm for scenarios where learning dynamics are not known.
arXiv Detail & Related papers (2023-02-27T14:47:52Z)
- MORAL: Aligning AI with Human Norms through Multi-Objective Reinforced
Active Learning [14.06682547001011]
State-of-the-art methods typically focus on learning a single reward model.
We propose Multi-Objective Reinforced Active Learning (MORAL), a novel method for combining diverse demonstrations of social norms.
Our approach is able to interactively tune a deep RL agent towards a variety of preferences, while eliminating the need for computing multiple policies.
arXiv Detail & Related papers (2021-12-30T19:21:03Z)
- Multi-Agent Imitation Learning with Copulas [102.27052968901894]
Multi-agent imitation learning aims to train multiple agents to perform tasks from demonstrations by learning a mapping between observations and actions.
In this paper, we propose to use a copula, a powerful statistical tool for capturing dependence among random variables, to explicitly model the correlation and coordination in multi-agent systems.
Our proposed model is able to separately learn marginals that capture the local behavioral patterns of each individual agent, as well as a copula function that solely and fully captures the dependence structure among agents.
arXiv Detail & Related papers (2021-07-10T03:49:41Z)
- Multiagent Deep Reinforcement Learning: Challenges and Directions
Towards Human-Like Approaches [0.0]
We present the most common multiagent problem representations and their main challenges.
We identify five research areas that address one or more of these challenges.
We suggest that, for multiagent reinforcement learning to be successful, future research addresses these challenges with an interdisciplinary approach.
arXiv Detail & Related papers (2021-06-29T19:53:15Z)
- Seeing Differently, Acting Similarly: Imitation Learning with
Heterogeneous Observations [126.78199124026398]
In many real-world imitation learning tasks, the demonstrator and the learner have to act in different but full observation spaces.
In this work, we model the above learning problem as Heterogeneous Observations Imitation Learning (HOIL).
We propose the Importance Weighting with REjection (IWRE) algorithm based on the techniques of importance-weighting, learning with rejection, and active querying to solve the key challenge of occupancy measure matching.
arXiv Detail & Related papers (2021-06-17T05:44:04Z)
- Behavior Priors for Efficient Reinforcement Learning [97.81587970962232]
We consider how information and architectural constraints can be combined with ideas from the probabilistic modeling literature to learn behavior priors.
We discuss how such latent variable formulations connect to related work on hierarchical reinforcement learning (HRL) and mutual information and curiosity based objectives.
We demonstrate the effectiveness of our framework by applying it to a range of simulated continuous control domains.
arXiv Detail & Related papers (2020-10-27T13:17:18Z)
- Human AI interaction loop training: New approach for interactive
reinforcement learning [0.0]
In various machine-learning decision-making tasks, Reinforcement Learning (RL) yields effective results with an agent that learns from a stand-alone reward function.
RL presents unique challenges when environment state and action spaces are large, as well as in the determination of rewards.
Imitation Learning (IL) offers a promising solution for those challenges using a teacher.
arXiv Detail & Related papers (2020-03-09T15:27:48Z)