Offsetting Unequal Competition through RL-assisted Incentive Schemes
- URL: http://arxiv.org/abs/2201.01450v1
- Date: Wed, 5 Jan 2022 04:47:22 GMT
- Title: Offsetting Unequal Competition through RL-assisted Incentive Schemes
- Authors: Paramita Koley, Aurghya Maiti, Sourangshu Bhattacharya, and Niloy
Ganguly
- Abstract summary: This paper investigates the dynamics of competition among organizations with unequal expertise.
We design Touch-Mark, a game based on the well-known multi-agent-particle-environment.
- Score: 18.57907480363166
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper investigates the dynamics of competition among organizations with
unequal expertise. Multi-agent reinforcement learning has been used to simulate
and understand the impact of various incentive schemes designed to offset such
inequality. We design Touch-Mark, a game based on well-known
multi-agent-particle-environment, where two teams (weak, strong) with unequal
but changing skill levels compete against each other. For training such a game,
we propose a novel controller assisted multi-agent reinforcement learning
algorithm \our\, which empowers each agent with an ensemble of policies along
with a supervised controller that by selectively partitioning the sample space,
triggers intelligent role division among the teammates. Using C-MADDPG as an
underlying framework, we propose an incentive scheme for the weak team such
that the final rewards of both teams become the same. We find that, in spite of
the incentive, the final reward of the weak team still falls short of that of
the strong team. On inspection, we realize that a team-level incentive for the
weak team does not motivate the weaker agents within that team to learn and
improve. To offset this, we additionally incentivize the weaker player to
learn and, as a result, observe that beyond an initial phase the weak team
performs on par with the stronger team. The final goal of the paper is to
formulate a dynamic incentive scheme that continuously balances the rewards of
the two teams. This is achieved by devising an incentive scheme enriched with
an RL agent that takes minimal information from the environment.
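To make the two ingredients concrete, here is a minimal sketch, assuming a generic Python interface rather than the authors' code: a supervised controller that picks one policy from an agent's ensemble in each state, and an incentive module that reads nothing but the observed reward gap and adjusts a bonus paid to the weak team. The policy/controller callables and the proportional update rule are illustrative assumptions; in the paper the dynamic bonus is set by an RL agent rather than a fixed rule.

```python
class ControllerAssistedAgent:
    """Sketch of an agent carrying an ensemble of role policies plus a
    supervised controller that decides which policy should act in a given
    state (the controller effectively partitions the state space into roles)."""

    def __init__(self, policies, controller):
        self.policies = policies      # e.g. [pursue_policy, support_policy]
        self.controller = controller  # callable: state -> index into policies

    def act(self, state):
        role = self.controller(state)       # pick a role for this state
        return self.policies[role](state)   # the selected role policy acts


class DynamicIncentive:
    """Sketch of a dynamic incentive scheme that observes only the
    per-episode reward gap between the two teams and adjusts a bonus paid
    to the weak team so that the final rewards stay balanced."""

    def __init__(self, step_size=0.05):
        self.bonus = 0.0
        self.step_size = step_size

    def update(self, strong_team_reward, weak_team_reward):
        gap = strong_team_reward - weak_team_reward
        # grow the bonus while the weak team trails, shrink it when it leads
        self.bonus = max(0.0, self.bonus + self.step_size * gap)
        return self.bonus

    def shaped_reward(self, weak_team_reward):
        return weak_team_reward + self.bonus
```

The point of the interface is that the incentive module needs only the two teams' episode rewards, mirroring the abstract's claim that the scheme takes minimal information from the environment.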
Related papers
- Transformer Guided Coevolution: Improved Team Formation in Multiagent Adversarial Games [1.2338485391170533]
We propose an algorithm that uses a transformer-based deep neural network with Masked Language Model training to select the best team of players from a trained population.
We test our algorithm in the multiagent adversarial game Marine Capture-The-Flag, and we find that BERTeam learns non-trivial team compositions that perform well against unseen opponents.
arXiv Detail & Related papers (2024-10-17T17:06:41Z)
- Multi-Agent Training for Pommerman: Curriculum Learning and Population-based Self-Play Approach [11.740631954398292]
Pommerman is an ideal benchmark for multi-agent training, providing a battleground for two teams with communication capabilities among allied agents.
This study introduces a system designed to train multi-agent systems to play Pommerman using a combination of curriculum learning and population-based self-play.
arXiv Detail & Related papers (2024-06-30T11:14:29Z)
- Robust and Performance Incentivizing Algorithms for Multi-Armed Bandits with Strategic Agents [57.627352949446625]
We consider a variant of the multi-armed bandit problem.
Specifically, the arms are strategic agents who can improve their rewards or absorb them.
We identify a class of MAB algorithms which satisfy a collection of properties and show that they lead to mechanisms that incentivize top level performance at equilibrium.
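For intuition only, the sketch below shows a generic winner-take-all allocation rule rather than the class of algorithms characterized in that paper: after a short round-robin exploration phase, every remaining pull goes to the empirically best arm, so a strategic arm that withholds performance early forfeits all later traffic. The `pull` callback and the explore-then-commit rule are assumptions for illustration.

```python
def explore_then_commit(arms, pull, horizon, explore_rounds=10):
    """Generic explore-then-commit allocation: each arm is sampled a few
    times, then the empirically best arm receives every remaining pull.
    `arms` is a list of arm identifiers and `pull(arm)` returns the reward
    the (possibly strategic) arm chooses to deliver."""
    rewards = {arm: [] for arm in arms}
    history = []
    for t in range(horizon):
        if t < explore_rounds * len(arms):
            arm = arms[t % len(arms)]   # round-robin exploration
        else:
            # winner-take-all: commit to the empirically best arm
            arm = max(arms, key=lambda a: sum(rewards[a]) / len(rewards[a]))
        r = pull(arm)
        rewards[arm].append(r)
        history.append((arm, r))
    return history
```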
arXiv Detail & Related papers (2023-12-13T06:54:49Z)
- Benchmarking Robustness and Generalization in Multi-Agent Systems: A Case Study on Neural MMO [50.58083807719749]
We present the results of the second Neural MMO challenge, hosted at IJCAI 2022, which received 1600+ submissions.
This competition targets robustness and generalization in multi-agent systems.
We will open-source our benchmark including the environment wrapper, baselines, a visualization tool, and selected policies for further research.
arXiv Detail & Related papers (2023-08-30T07:16:11Z)
- Cooperation or Competition: Avoiding Player Domination for Multi-Target Robustness via Adaptive Budgets [76.20705291443208]
We view adversarial attacks as a bargaining game in which different players negotiate to reach an agreement on a joint direction of parameter updating.
We design a novel framework that adjusts the budgets of different adversaries to avoid any player dominance.
Experiments on standard benchmarks show that applying the proposed framework to existing approaches significantly advances multi-target robustness.
arXiv Detail & Related papers (2023-06-27T14:02:10Z)
- Neural Payoff Machines: Predicting Fair and Stable Payoff Allocations Among Team Members [13.643650155415484]
We show how cooperative game-theoretic solutions can be distilled into a learned model by training neural networks.
Our approach creates models that can generalize to games far from the training distribution.
An important application of our framework is Explainable AI.
arXiv Detail & Related papers (2022-08-18T12:33:09Z)
- Explore and Control with Adversarial Surprise [78.41972292110967]
Reinforcement learning (RL) provides a framework for learning goal-directed policies given user-specified rewards.
We propose a new unsupervised RL technique based on an adversarial game which pits two policies against each other to compete over the amount of surprise an RL agent experiences.
We show that our method leads to the emergence of complex skills by exhibiting clear phase transitions.
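One schematic way to read this adversarial game: the surprise signal acts as a zero-sum reward between an Explore policy and a Control policy. The snippet below uses a plain histogram entropy over discretized observations as a stand-in surprise measure, which is an assumption for illustration, not the paper's estimator.

```python
import math
from collections import Counter

def observation_entropy(observations):
    """Crude surprise proxy: entropy of the empirical distribution over
    (discretized) observations collected during a phase."""
    counts = Counter(observations)
    total = sum(counts.values())
    return -sum((c / total) * math.log(c / total) for c in counts.values())

def phase_rewards(observations):
    """Zero-sum split: the Explore policy is paid for raising surprise,
    the Control policy for driving it down."""
    surprise = observation_entropy(observations)
    return {"explore": +surprise, "control": -surprise}
```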
arXiv Detail & Related papers (2021-07-12T17:58:40Z)
- Coach-Player Multi-Agent Reinforcement Learning for Dynamic Team Composition [88.26752130107259]
In real-world multiagent systems, agents with different capabilities may join or leave without altering the team's overarching goals.
We propose COPA, a coach-player framework to tackle this problem.
We 1) adopt the attention mechanism for both the coach and the players; 2) propose a variational objective to regularize learning; and 3) design an adaptive communication method to let the coach decide when to communicate with the players.
arXiv Detail & Related papers (2021-05-18T17:27:37Z) - Multi-Agent Coordination in Adversarial Environments through Signal
Mediated Strategies [37.00818384785628]
Team members can coordinate their strategies before the beginning of the game, but are unable to communicate during the playing phase of the game.
In this setting, model-free RL methods are oftentimes unable to capture coordination because agents' policies are executed in a decentralized fashion.
We show convergence to coordinated equilibria in cases where previous state-of-the-art multi-agent RL algorithms did not.
arXiv Detail & Related papers (2021-02-09T18:44:16Z) - Multi-Agent Collaboration via Reward Attribution Decomposition [75.36911959491228]
We propose Collaborative Q-learning (CollaQ) that achieves state-of-the-art performance in the StarCraft multi-agent challenge.
CollaQ is evaluated on various StarCraft maps and shows that it outperforms existing state-of-the-art techniques.
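Read through the lens of reward attribution, each agent's Q-value splits into a term depending only on the agent's own observation and an interactive term depending on teammates. The PyTorch sketch below illustrates one such decomposition; the layer sizes, the masking penalty, and the module names are illustrative assumptions rather than the paper's exact architecture or loss.

```python
import torch.nn as nn

class DecomposedQ(nn.Module):
    """Illustrative decomposition Q(o) = Q_alone(o_self) + Q_collab(o_full):
    the collaborative term is pushed toward zero whenever information about
    the other agents is masked, so solo behaviour is credited to Q_alone."""

    def __init__(self, self_dim, full_dim, n_actions, hidden=64):
        super().__init__()
        self.q_alone = nn.Sequential(
            nn.Linear(self_dim, hidden), nn.ReLU(), nn.Linear(hidden, n_actions))
        self.q_collab = nn.Sequential(
            nn.Linear(full_dim, hidden), nn.ReLU(), nn.Linear(hidden, n_actions))

    def forward(self, obs_self, obs_full):
        return self.q_alone(obs_self) + self.q_collab(obs_full)

    def attribution_penalty(self, obs_full_masked):
        # keep Q_collab near zero on teammate-masked inputs
        return self.q_collab(obs_full_masked).pow(2).mean()
```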
arXiv Detail & Related papers (2020-10-16T17:42:11Z)
- Natural Emergence of Heterogeneous Strategies in Artificially Intelligent Competitive Teams [0.0]
We develop a competitive multi agent environment called FortAttack in which two teams compete against each other.
We observe a natural emergence of heterogeneous behavior amongst homogeneous agents when such behavior can lead to the team's success.
We propose ensemble training, in which we utilize the evolved opponent strategies to train a single policy for friendly agents.
arXiv Detail & Related papers (2020-07-06T22:35:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.