Emergent Dominance Hierarchies in Reinforcement Learning Agents
- URL: http://arxiv.org/abs/2401.12258v7
- Date: Sat, 22 Jun 2024 11:44:33 GMT
- Title: Emergent Dominance Hierarchies in Reinforcement Learning Agents
- Authors: Ram Rachum, Yonatan Nakar, Bill Tomlinson, Nitay Alon, Reuth Mirsky,
- Abstract summary: Modern Reinforcement Learning (RL) algorithms are able to outperform humans in a wide variety of tasks.
We show that populations of RL agents can invent, learn, enforce, and transmit a dominance hierarchy to new populations.
The dominance hierarchies that emerge have a similar structure to those studied in chickens, mice, fish, and other species.
- Score: 5.451419559128312
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern Reinforcement Learning (RL) algorithms are able to outperform humans in a wide variety of tasks. Multi-agent reinforcement learning (MARL) settings present additional challenges, and successful cooperation in mixed-motive groups of agents depends on a delicate balancing act between individual and group objectives. Social conventions and norms, often inspired by human institutions, are used as tools for striking this balance. In this paper, we examine a fundamental, well-studied social convention that underlies cooperation in both animal and human societies: dominance hierarchies. We adapt the ethological theory of dominance hierarchies to artificial agents, borrowing the established terminology and definitions with as few amendments as possible. We demonstrate that populations of RL agents, operating without explicit programming or intrinsic rewards, can invent, learn, enforce, and transmit a dominance hierarchy to new populations. The dominance hierarchies that emerge have a similar structure to those studied in chickens, mice, fish, and other species.
Related papers
- SocialGFs: Learning Social Gradient Fields for Multi-Agent Reinforcement Learning [58.84311336011451]
We propose a novel gradient-based state representation for multi-agent reinforcement learning.
We employ denoising score matching to learn the social gradient fields (SocialGFs) from offline samples.
In practice, we integrate SocialGFs into the widely used multi-agent reinforcement learning algorithms, e.g., MAPPO.
arXiv Detail & Related papers (2024-05-03T04:12:19Z) - Mathematics of multi-agent learning systems at the interface of game theory and artificial intelligence [0.8049333067399385]
Evolutionary Game Theory and Artificial Intelligence are two fields that, at first glance, might seem distinct, but they have notable connections and intersections.
The former focuses on the evolution of behaviors (or strategies) in a population, where individuals interact with others and update their strategies based on imitation (or social learning).
The latter, meanwhile, is centered on machine learning algorithms and (deep) neural networks.
arXiv Detail & Related papers (2024-03-09T17:36:54Z) - Innate-Values-driven Reinforcement Learning for Cooperative Multi-Agent Systems [1.8220718426493654]
Innate values describe agents' intrinsic motivations, which reflect their inherent interests and preferences to pursue goals.
The essence of reinforcement learning (RL) is learning from interaction based on reward-driven (such as utilities) behaviors.
This paper proposes a hierarchical compound value reinforcement learning model -- intrinsic-driven reinforcement learning -- to describe the complex behaviors of multi-agent interaction.
arXiv Detail & Related papers (2024-01-10T22:51:10Z) - Agent Alignment in Evolving Social Norms [65.45423591744434]
We propose an evolutionary framework for agent evolution and alignment, named EvolutionaryAgent.
In an environment where social norms continuously evolve, agents better adapted to the current social norms will have a higher probability of survival and proliferation.
We show that EvolutionaryAgent can align progressively better with the evolving social norms while maintaining its proficiency in general tasks.
arXiv Detail & Related papers (2024-01-09T15:44:44Z) - ELIGN: Expectation Alignment as a Multi-Agent Intrinsic Reward [29.737986509769808]
We propose a self-supervised intrinsic reward, ELIGN (expectation alignment).
Similar to how animals collaborate in a decentralized manner with those in their vicinity, agents trained with expectation alignment learn behaviors that match their neighbors' expectations.
We show that agent coordination improves through expectation alignment because agents learn to divide tasks amongst themselves, break coordination symmetries, and confuse adversaries.
arXiv Detail & Related papers (2022-10-09T22:24:44Z) - LDSA: Learning Dynamic Subtask Assignment in Cooperative Multi-Agent Reinforcement Learning [122.47938710284784]
We propose a novel framework for learning dynamic subtask assignment (LDSA) in cooperative MARL.
To reasonably assign agents to different subtasks, we propose an ability-based subtask selection strategy.
We show that LDSA learns reasonable and effective subtask assignment for better collaboration.
arXiv Detail & Related papers (2022-05-05T10:46:16Z) - Generalization in Cooperative Multi-Agent Systems [49.16349318581611]
We study the theoretical underpinnings of Combinatorial Generalization (CG) for cooperative multi-agent systems.
CG is a highly desirable trait for autonomous systems as it can increase their utility and deployability across a wide range of applications.
arXiv Detail & Related papers (2022-01-31T21:39:56Z) - Autonomous Reinforcement Learning: Formalism and Benchmarking [106.25788536376007]
Real-world embodied learning, such as that performed by humans and animals, is situated in a continual, non-episodic world.
Common benchmark tasks in RL are episodic, with the environment resetting between trials to provide the agent with multiple attempts.
This discrepancy presents a major challenge when attempting to take RL algorithms developed for episodic simulated environments and run them on real-world platforms.
arXiv Detail & Related papers (2021-12-17T16:28:06Z) - Improved cooperation by balancing exploration and exploitation in intertemporal social dilemma tasks [2.541277269153809]
We propose a new learning strategy for achieving coordination by incorporating a learning rate that can balance exploration and exploitation.
We show that agents using this simple strategy improve the collective return in a decision task called the intertemporal social dilemma.
We also explore the effects of the diversity of learning rates on the population of reinforcement learning agents and show that agents trained in heterogeneous populations develop particularly coordinated policies.
arXiv Detail & Related papers (2021-10-19T08:40:56Z) - Provable Hierarchy-Based Meta-Reinforcement Learning [50.17896588738377]
We analyze HRL in the meta-RL setting, where a learner learns latent hierarchical structure during meta-training for use in a downstream task.
We provide "diversity conditions" which, together with a tractable optimism-based algorithm, guarantee sample-efficient recovery of this natural hierarchy.
Our bounds incorporate common notions in HRL literature such as temporal and state/action abstractions, suggesting that our setting and analysis capture important features of HRL in practice.
arXiv Detail & Related papers (2021-10-18T17:56:02Z) - Affinity-Based Hierarchical Learning of Dependent Concepts for Human Activity Recognition [6.187780920448871]
We show that organizing overlapping classes into hierarchies considerably improves classification performance.
This is particularly true in the case of activity recognition tasks featured in the SHL dataset.
We propose an approach based on transfer affinity among the classes to determine an optimal hierarchy for the learning process.
arXiv Detail & Related papers (2021-04-11T01:08:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.