Balancing Rational and Other-Regarding Preferences in
Cooperative-Competitive Environments
- URL: http://arxiv.org/abs/2102.12307v1
- Date: Wed, 24 Feb 2021 14:35:32 GMT
- Title: Balancing Rational and Other-Regarding Preferences in
Cooperative-Competitive Environments
- Authors: Dmitry Ivanov, Vladimir Egorov, Aleksei Shpilman
- Abstract summary: Mixed environments are notorious for the conflicts of selfish and social interests.
We propose BAROCCO to balance individual and social incentives.
Our meta-algorithm is compatible with both Q-learning and Actor-Critic frameworks.
- Score: 4.705291741591329
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent reinforcement learning studies extensively explore the interplay
between cooperative and competitive behaviour in mixed environments. Unlike
cooperative environments where agents strive towards a common goal, mixed
environments are notorious for the conflicts of selfish and social interests.
As a consequence, purely rational agents often struggle to achieve and maintain
cooperation. A prevalent approach to induce cooperative behaviour is to assign
additional rewards based on other agents' well-being. However, this approach
suffers from the issue of multi-agent credit assignment, which can hinder
performance. This issue is efficiently alleviated in cooperative settings by
state-of-the-art algorithms such as QMIX and COMA. Still, when applied to mixed
environments, these algorithms may result in unfair allocation of rewards. We
propose BAROCCO, an extension of these algorithms capable of balancing
individual and social incentives. The mechanism behind BAROCCO is to train two
distinct but interwoven components that jointly affect each agent's decisions.
Our meta-algorithm is compatible with both Q-learning and Actor-Critic
frameworks. We experimentally confirm the advantages of BAROCCO over existing
methods and explore its behavioural aspects in two mixed multi-agent setups.
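The abstract leaves the exact combination rule unspecified; the snippet below is a minimal, hypothetical sketch of the general idea of balancing a selfish value component against a social one in a Q-learning agent. The Q-tables, the convex combination, and `social_coef` are all illustrative assumptions, not BAROCCO's actual mechanism.

```python
import numpy as np

# Hypothetical sketch: action selection balancing a selfish Q-component
# with a social component trained on all agents' well-being. The convex
# combination and `social_coef` are illustrative assumptions.
def act(q_selfish, q_social, state, social_coef=0.5):
    combined = ((1.0 - social_coef) * q_selfish[state]
                + social_coef * q_social[state])
    return int(np.argmax(combined))

# Toy usage: 2 states x 3 actions per component.
q_selfish = np.array([[1.0, 0.0, 0.0], [0.2, 0.1, 0.9]])
q_social = np.array([[0.0, 1.0, 0.0], [0.9, 0.1, 0.1]])
print(act(q_selfish, q_social, state=1))  # -> 0: the social component
# outweighs the selfish preference for action 2
```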
Related papers
- Learning to Balance Altruism and Self-interest Based on Empathy in Mixed-Motive Games [47.8980880888222]
Multi-agent scenarios often involve mixed motives, demanding altruistic agents capable of self-protection against potential exploitation.
We propose LASE (Learning to balance Altruism and Self-interest based on Empathy).
LASE allocates a portion of its rewards to co-players as gifts, with this allocation adapting dynamically based on the social relationship.
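A minimal sketch of the gifting mechanism under assumptions of my own: the relationship scores, the fixed `gift_fraction`, and the proportional split below are illustrative stand-ins for LASE's learned, dynamically adapted allocation.

```python
import numpy as np

def gift_rewards(own_reward, relationships, gift_fraction=0.2):
    """Give away a fraction of the agent's reward as gifts to co-players,
    split in proportion to (hypothetical) social-relationship scores."""
    w = np.clip(np.asarray(relationships, dtype=float), 0.0, None)
    if w.sum() == 0.0:  # no positive relationships: keep everything
        return own_reward, np.zeros_like(w)
    gifts = gift_fraction * own_reward * (w / w.sum())
    return own_reward - gifts.sum(), gifts

kept, gifts = gift_rewards(10.0, relationships=[0.6, 0.0, 0.3])
print(kept, gifts)  # 8.0 [1.333... 0. 0.666...]
```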
arXiv Detail & Related papers (2024-10-10T12:30:56Z)
- Emergent Cooperation under Uncertain Incentive Alignment [7.906156032228933]
We study how cooperation can arise among reinforcement learning agents in scenarios characterised by infrequent encounters.
We study the effects of mechanisms, such as reputation and intrinsic rewards, that have been proposed in the literature to foster cooperation in mixed-motive environments.
arXiv Detail & Related papers (2024-01-23T10:55:54Z)
- Situation-Dependent Causal Influence-Based Cooperative Multi-agent Reinforcement Learning [18.054709749075194]
We propose a novel MARL algorithm named Situation-Dependent Causal Influence-Based Cooperative Multi-agent Reinforcement Learning (SCIC).
Our approach aims to detect inter-agent causal influences in specific situations, based on a criterion that uses causal intervention and conditional mutual information.
The resulting update links coordinated exploration with intrinsic reward distribution, enhancing overall collaboration and performance.
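As a rough illustration of an intervention-style influence measure in this spirit (the `policy_j` interface and the discrete setting are my assumptions; SCIC's actual criterion is a conditional mutual information computed with its own estimator):

```python
import numpy as np

def kl(p, q, eps=1e-12):
    # KL divergence between two discrete distributions (with smoothing).
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

def influence_bonus(policy_j, state, actions_i, prior_i, chosen_a_i):
    """Influence of agent i on agent j in one state: KL between j's action
    distribution given i's chosen action and j's marginal with i's action
    intervened on (averaged out). `policy_j(state, a_i)` returning a
    distribution over j's actions is an assumed interface."""
    conditional = np.asarray(policy_j(state, chosen_a_i), dtype=float)
    marginal = sum(p * np.asarray(policy_j(state, a), dtype=float)
                   for a, p in zip(actions_i, prior_i))
    return kl(conditional, marginal)
```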
arXiv Detail & Related papers (2023-12-15T05:09:32Z)
- CoMIX: A Multi-agent Reinforcement Learning Training Architecture for Efficient Decentralized Coordination and Independent Decision-Making [2.4555276449137042]
Robust coordination skills enable agents to operate cohesively in shared environments, working together towards a common goal and, ideally, without hindering each other's progress.
This paper presents Coordinated QMIX (CoMIX), a novel training framework for decentralized agents that enables emergent coordination through flexible policies while allowing independent decision-making at the individual level.
arXiv Detail & Related papers (2023-08-21T13:45:44Z)
- Learning Reward Machines in Cooperative Multi-Agent Tasks [75.79805204646428]
This paper presents a novel approach to Multi-Agent Reinforcement Learning (MARL) that combines cooperative task decomposition with the learning of reward machines (RMs) encoding the structure of the sub-tasks.
The proposed method helps deal with the non-Markovian nature of the rewards in partially observable environments.
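For intuition, a reward machine is a finite automaton over high-level events whose transitions emit reward, letting reward depend on event history rather than the current environment state alone. The toy machine below (states, events, and rewards invented for illustration) pays off only if a key is obtained before the door is opened:

```python
class RewardMachine:
    """Toy reward machine: (state, event) -> (next state, reward).
    States and events here are invented for illustration."""
    def __init__(self):
        self.delta = {
            ("u0", "got_key"): ("u1", 0.0),
            ("u1", "door_open"): ("u2", 1.0),  # rewarded only after the key
        }
        self.state = "u0"

    def step(self, event):
        self.state, reward = self.delta.get((self.state, event),
                                            (self.state, 0.0))
        return reward

rm = RewardMachine()
print(rm.step("door_open"), rm.step("got_key"), rm.step("door_open"))
# 0.0 0.0 1.0 -- the same event is rewarded only in the right context
```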
arXiv Detail & Related papers (2023-03-24T15:12:28Z)
- Adaptive Value Decomposition with Greedy Marginal Contribution Computation for Cooperative Multi-Agent Reinforcement Learning [48.41925886860991]
Real-world cooperation often requires intensive, simultaneous coordination among agents.
Traditional methods that learn the value function as a monotonic mixing of per-agent utilities cannot solve tasks with non-monotonic returns.
We propose a novel explicit credit assignment method to address the non-monotonic problem.
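A sketch of explicit credit via greedy marginal contributions; the `coalition_value` oracle below is an assumed stand-in for the learned value function, and the greedy ordering is the illustrative part:

```python
def greedy_marginal_credits(agents, coalition_value):
    """Assign each agent the value it adds when greedily joining the
    coalition. `coalition_value(frozenset) -> float` is an assumed oracle."""
    credits, coalition = {}, frozenset()
    remaining = list(agents)
    while remaining:
        base = coalition_value(coalition)
        # Greedily add the agent with the largest marginal contribution.
        best = max(remaining,
                   key=lambda a: coalition_value(coalition | {a}) - base)
        credits[best] = coalition_value(coalition | {best}) - base
        coalition |= {best}
        remaining.remove(best)
    return credits

# Toy non-monotonic team value: together the agents are worth less than
# the sum of their solo values.
value = {frozenset(): 0.0, frozenset({"a"}): 1.0,
         frozenset({"b"}): 1.0, frozenset({"a", "b"}): 1.5}
print(greedy_marginal_credits(["a", "b"], value.__getitem__))
# {'a': 1.0, 'b': 0.5}
```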
arXiv Detail & Related papers (2023-02-14T07:23:59Z)
- Revisiting QMIX: Discriminative Credit Assignment by Gradient Entropy Regularization [126.87359177547455]
In cooperative multi-agent systems, agents jointly take actions and receive a team reward instead of individual rewards.
In the absence of individual reward signals, credit assignment mechanisms are usually introduced to discriminate the contributions of different agents.
We propose a new perspective on credit assignment measurement and empirically show that QMIX suffers from limited discriminability in its assignment of credits to agents.
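A minimal sketch of the measurement suggested by the title, with my own normalization and toy numbers: the flatter the gradients of Q_tot with respect to the per-agent utilities, the higher their entropy and the less the mixer discriminates between agents.

```python
import numpy as np

def gradient_entropy(grads_qtot_wrt_qi, eps=1e-12):
    """Entropy of normalized |dQ_tot/dQ_i|: high entropy means credit is
    spread near-uniformly, i.e. the mixer barely discriminates agents."""
    g = np.abs(np.asarray(grads_qtot_wrt_qi, dtype=float)) + eps
    p = g / g.sum()
    return float(-np.sum(p * np.log(p)))

print(gradient_entropy([0.25, 0.25, 0.25, 0.25]))  # ~1.386: indiscriminate
print(gradient_entropy([0.97, 0.01, 0.01, 0.01]))  # ~0.17: discriminative
```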
arXiv Detail & Related papers (2022-02-09T12:37:55Z) - Normative Disagreement as a Challenge for Cooperative AI [56.34005280792013]
We argue that typical cooperation-inducing learning algorithms fail to cooperate in bargaining problems.
We develop a class of norm-adaptive policies and show in experiments that these significantly increase cooperation.
arXiv Detail & Related papers (2021-11-27T11:37:42Z) - Cooperative and Competitive Biases for Multi-Agent Reinforcement
Learning [12.676356746752893]
Training a multi-agent reinforcement learning (MARL) algorithm is more challenging than training a single-agent reinforcement learning algorithm.
We propose an algorithm that boosts MARL training using biased action information from other agents, based on a friend-or-foe concept.
We empirically demonstrate that our algorithm outperforms existing algorithms in various mixed cooperative-competitive environments.
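The summary does not spell out the biasing rule; a classic reference point is friend-or-foe value estimation in the style of Littman's friend-or-foe Q-learning, sketched below with an invented discrete joint Q-table: friends are assumed to choose actions that help the agent, foes actions that oppose it.

```python
# Friend-or-foe flavored value for agent i: friends are assumed to pick
# joint actions that help i, foes joint actions that hurt i (toy sketch).
def friend_or_foe_value(q, friend_actions, foe_actions):
    return max(min(q[(af, ao)] for ao in foe_actions)
               for af in friend_actions)

q = {("up", "block"): 0.1, ("up", "yield"): 1.0,
     ("down", "block"): 0.4, ("down", "yield"): 0.5}
print(friend_or_foe_value(q, ["up", "down"], ["block", "yield"]))
# 0.4 -- "down" is the action that is safest against a hostile foe
```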
arXiv Detail & Related papers (2021-01-18T05:52:22Z)
- UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning [53.73686229912562]
We propose a novel MARL approach called Universal Value Exploration (UneVEn).
UneVEn learns a set of related tasks simultaneously with a linear decomposition of universal successor features.
Empirical results on a set of exploration games, challenging cooperative predator-prey tasks requiring significant coordination among agents, and StarCraft II micromanagement benchmarks show that UneVEn can solve tasks where other state-of-the-art MARL methods fail.
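For the successor-feature part, the core identity is Q(s, a; w) = psi(s, a) . w: one learned feature map psi serves every task whose reward is linear in those features, which is what lets related tasks be learned simultaneously. A toy sketch with invented feature values and task weights:

```python
import numpy as np

def q_value(psi_sa, w):
    """Successor features: Q(s, a; w) = psi(s, a) . w, so related tasks
    differ only in the weight vector w over shared features psi."""
    return float(np.dot(psi_sa, w))

psi_sa = np.array([0.2, 0.7, 0.1])  # features of one (state, action) pair
print(q_value(psi_sa, np.array([1.0, 0.0, 0.0])))  # task 1 -> 0.2
print(q_value(psi_sa, np.array([0.0, 1.0, 1.0])))  # task 2 -> ~0.8
```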
arXiv Detail & Related papers (2020-10-06T19:08:47Z)