The Emergence of Individuality
- URL: http://arxiv.org/abs/2006.05842v2
- Date: Mon, 18 Oct 2021 08:12:57 GMT
- Title: The Emergence of Individuality
- Authors: Jiechuan Jiang and Zongqing Lu
- Abstract summary: We propose a simple yet efficient method for the emergence of individuality (EOI) in multi-agent reinforcement learning (MARL).
EOI learns a probabilistic classifier that predicts a probability distribution over agents given their observation.
We show that EOI significantly outperforms existing methods in a variety of multi-agent cooperative scenarios.
- Score: 33.4713690991284
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Individuality is essential in human society: it induces the
division of labor and thus improves efficiency and productivity. Similarly,
it should also be key to multi-agent cooperation. Inspired by the fact that
individuality is the quality of being an individual separate from others, we
propose a simple yet efficient method for the emergence of individuality
(EOI) in multi-agent reinforcement learning (MARL). EOI learns a
probabilistic classifier that predicts a
probability distribution over agents given their observation and gives each
agent an intrinsic reward of being correctly predicted by the classifier. The
intrinsic reward encourages the agents to visit their own familiar
observations, and training the classifier on such observations makes the
intrinsic reward signals stronger and the agents more identifiable. To further
enhance the intrinsic reward and promote the emergence of individuality, two
regularizers are proposed to increase the discriminability of the classifier.
We implement EOI on top of popular MARL algorithms. Empirically, we show that
EOI significantly outperforms existing methods in a variety of multi-agent
cooperative scenarios.
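Below is a minimal sketch of the intrinsic-reward idea described in the abstract, not the authors' implementation: a classifier P(agent | observation) is trained with cross-entropy on each agent's own observations, and each agent receives the probability of being correctly identified as an intrinsic reward added to the environment reward. The network sizes, the weighting coefficient alpha, and all names below are illustrative assumptions, and the paper's two discriminability regularizers are omitted.

```python
# Hedged sketch of the EOI intrinsic-reward mechanism (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class AgentClassifier(nn.Module):
    """Probabilistic classifier P(agent | observation)."""
    def __init__(self, obs_dim: int, n_agents: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_agents),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # Distribution over agent identities for each observation.
        return F.softmax(self.net(obs), dim=-1)

def intrinsic_rewards(clf: AgentClassifier, obs: torch.Tensor) -> torch.Tensor:
    """r_int[i] = P(i | o_i): probability that agent i's observation
    is attributed to agent i by the classifier."""
    with torch.no_grad():
        probs = clf(obs)               # shape (n_agents, n_agents)
    return probs.diagonal()            # shape (n_agents,)

def classifier_loss(clf: AgentClassifier, obs: torch.Tensor,
                    agent_ids: torch.Tensor) -> torch.Tensor:
    """Cross-entropy on (observation, agent id) pairs drawn from the agents'
    own experience; the paper's regularizers are omitted in this sketch."""
    logits = clf.net(obs)
    return F.cross_entropy(logits, agent_ids)

if __name__ == "__main__":
    n_agents, obs_dim, alpha = 3, 8, 0.5   # alpha is an assumed scaling weight
    clf = AgentClassifier(obs_dim, n_agents)
    opt = torch.optim.Adam(clf.parameters(), lr=1e-3)

    obs = torch.randn(n_agents, obs_dim)   # one observation per agent
    r_env = torch.zeros(n_agents)          # extrinsic/shared rewards
    r_total = r_env + alpha * intrinsic_rewards(clf, obs)

    # Train the classifier on the same observations, labelled by agent index.
    loss = classifier_loss(clf, obs, torch.arange(n_agents))
    opt.zero_grad(); loss.backward(); opt.step()
    print(r_total, loss.item())
```

In practice, r_total would simply replace the environment reward in whichever base MARL algorithm EOI is layered on.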
Related papers
- Enhancing Heterogeneous Multi-Agent Cooperation in Decentralized MARL via GNN-driven Intrinsic Rewards [1.179778723980276]
Multi-agent Reinforcement Learning (MARL) is emerging as a key framework for sequential decision-making and control tasks.
The deployment of these systems in real-world scenarios often requires decentralized training, a diverse set of agents, and learning from infrequent environmental reward signals.
We propose the CoHet algorithm, which utilizes a novel Graph Neural Network (GNN) based intrinsic motivation to facilitate the learning of heterogeneous agent policies.
arXiv Detail & Related papers (2024-08-12T21:38:40Z)
- DCIR: Dynamic Consistency Intrinsic Reward for Multi-Agent Reinforcement Learning [84.22561239481901]
We propose a new approach that enables agents to learn whether their behaviors should be consistent with that of other agents.
We evaluate DCIR in multiple environments including Multi-agent Particle, Google Research Football and StarCraft II Micromanagement.
arXiv Detail & Related papers (2023-12-10T06:03:57Z)
- Unilaterally Aggregated Contrastive Learning with Hierarchical Augmentation for Anomaly Detection [64.50126371767476]
We propose Unilaterally Aggregated Contrastive Learning with Hierarchical Augmentation (UniCon-HA).
We explicitly encourage the concentration of inliers and the dispersion of virtual outliers via supervised and unsupervised contrastive losses.
Our method is evaluated under three AD settings including unlabeled one-class, unlabeled multi-class, and labeled multi-class.
arXiv Detail & Related papers (2023-08-20T04:01:50Z)
- STRAPPER: Preference-based Reinforcement Learning via Self-training Augmentation and Peer Regularization [18.811470043767713]
Preference-based reinforcement learning (PbRL) promises to learn a complex reward function with binary human preference.
We present a self-training method along with our proposed peer regularization, which penalizes the reward model memorizing uninformative labels and acquires confident predictions.
arXiv Detail & Related papers (2023-07-19T00:31:58Z)
- Learning to Incentivize Other Learning Agents [73.03133692589532]
We show how to equip RL agents with the ability to give rewards directly to other agents, using a learned incentive function.
Such agents significantly outperform standard RL and opponent-shaping agents in challenging general-sum Markov games.
Our work points toward more opportunities and challenges along the path to ensure the common good in a multi-agent future.
arXiv Detail & Related papers (2020-06-10T20:12:38Z)
- Randomized Entity-wise Factorization for Multi-Agent Reinforcement Learning [59.62721526353915]
Multi-agent settings in the real world often involve tasks with varying types and quantities of agents and non-agent entities.
Our method aims to leverage these commonalities by asking the question: "What is the expected utility of each agent when only considering a randomly selected sub-group of its observed entities?"
arXiv Detail & Related papers (2020-06-07T18:28:41Z)
- Scalable Multi-Agent Inverse Reinforcement Learning via Actor-Attention-Critic [54.2180984002807]
Multi-agent adversarial inverse reinforcement learning (MA-AIRL) is a recent approach that applies single-agent AIRL to multi-agent problems.
We propose a multi-agent inverse RL algorithm that is more sample-efficient and scalable than previous works.
arXiv Detail & Related papers (2020-02-24T20:30:45Z)
- Individual specialization in multi-task environments with multiagent reinforcement learners [0.0]
There is growing interest in Multi-Agent Reinforcement Learning (MARL) as a first step towards building generally intelligent agents.
Previous results point towards the conditions for increased coordination, efficiency/fairness, and common-pool resource sharing.
We further study coordination in multi-task environments where several rewarding tasks can be performed and thus agents don't necessarily need to perform well in all tasks, but under certain conditions may specialize.
arXiv Detail & Related papers (2019-12-29T15:20:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.