Learning cooperative behaviours in adversarial multi-agent systems
- URL: http://arxiv.org/abs/2302.05528v1
- Date: Fri, 10 Feb 2023 22:12:29 GMT
- Title: Learning cooperative behaviours in adversarial multi-agent systems
- Authors: Ni Wang, Gautham P. Das, Alan G. Millard
- Abstract summary: This work extends an existing virtual multi-agent platform called RoboSumo to create TripleSumo.
We investigate a scenario in which two agents, namely `Bug' and `Ant', must team up and push another agent `Spider' out of the arena.
To tackle this goal, the newly added agent `Bug' is trained during an ongoing match between `Ant' and `Spider'.
- Score: 2.355408272992293
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This work extends an existing virtual multi-agent platform called RoboSumo to
create TripleSumo -- a platform for investigating multi-agent cooperative
behaviors in continuous action spaces, with physical contact in an adversarial
environment. In this paper we investigate a scenario in which two agents,
namely `Bug' and `Ant', must team up and push another agent `Spider' out of the
arena. To tackle this goal, the newly added agent `Bug' is trained during an
ongoing match between `Ant' and `Spider'. `Bug' must develop awareness of the
other agents' actions, infer the strategy of both sides, and eventually learn
an action policy to cooperate. The reinforcement learning algorithm Deep
Deterministic Policy Gradient (DDPG) is implemented with a hybrid reward
structure combining dense and sparse rewards. The cooperative behavior is
quantitatively evaluated by the mean probability of winning the match and mean
number of steps needed to win.
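The paper does not reproduce its reward coefficients here, so the following is a minimal sketch of what a hybrid dense-plus-sparse reward for the push-out task might look like; `hybrid_reward`, its arguments, and all constants are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def hybrid_reward(spider_pos, arena_radius, pushed_out,
                  dense_scale=1.0, win_bonus=100.0, step_penalty=0.01):
    """Illustrative dense + sparse reward for pushing an opponent out.

    Dense shaping grows as `Spider' nears the arena edge; a sparse bonus
    fires only on an actual win. All coefficients are hypothetical.
    """
    # Dense term: opponent's normalised distance from the arena centre.
    dense = dense_scale * (np.linalg.norm(spider_pos[:2]) / arena_radius)
    # Sparse term: large one-off bonus when Spider is pushed out.
    sparse = win_bonus if pushed_out else 0.0
    # Per-step penalty rewards faster wins, matching the paper's
    # "mean number of steps needed to win" evaluation metric.
    return dense + sparse - step_penalty

# Example: Spider most of the way to the edge, match not yet won.
r = hybrid_reward(np.array([0.8, 0.42, 0.1]), arena_radius=1.0, pushed_out=False)
```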
Related papers
- Human-Agent Coordination in Games under Incomplete Information via Multi-Step Intent [21.170542003568674]
Strategic coordination between autonomous agents and human partners can be modeled as turn-based cooperative games.
We extend a turn-based game under incomplete information to allow players to take multiple actions per turn rather than a single action.
arXiv Detail & Related papers (2024-10-23T19:37:19Z)
- N-Agent Ad Hoc Teamwork [36.10108537776956]
Current approaches to learning cooperative multi-agent behaviors assume relatively restrictive settings.
This paper formalizes the problem and proposes the Policy Optimization with Agent Modelling (POAM) algorithm.
POAM is a policy-gradient multi-agent reinforcement learning approach to the NAHT (N-agent ad hoc teamwork) problem that enables adaptation to diverse teammate behaviors.
arXiv Detail & Related papers (2024-04-16T17:13:08Z)
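POAM's architecture is not spelled out in the summary above; as a rough illustration of the agent-modelling idea it names (conditioning a learner's policy on a learned embedding of teammate behavior), here is a PyTorch sketch. All class, module, and parameter names are hypothetical, not POAM's actual components.

```python
import torch
import torch.nn as nn

class TeammateEncoder(nn.Module):
    """Encodes a teammate's recent observation-action history into an
    embedding the policy can condition on (hypothetical module)."""
    def __init__(self, obs_act_dim, embed_dim=32):
        super().__init__()
        self.rnn = nn.GRU(obs_act_dim, embed_dim, batch_first=True)

    def forward(self, history):            # history: (batch, T, obs_act_dim)
        _, h = self.rnn(history)
        return h.squeeze(0)                # (batch, embed_dim)

class ConditionedPolicy(nn.Module):
    """Policy that sees its own observation plus the teammate embedding,
    so its behavior can adapt as the inferred teammate changes."""
    def __init__(self, obs_dim, embed_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + embed_dim, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, obs, teammate_embed):
        return self.net(torch.cat([obs, teammate_embed], dim=-1))
```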
- DCIR: Dynamic Consistency Intrinsic Reward for Multi-Agent Reinforcement Learning [84.22561239481901]
We propose a new approach that enables agents to learn whether their behaviors should be consistent with those of other agents.
We evaluate DCIR in multiple environments including Multi-agent Particle, Google Research Football and StarCraft II Micromanagement.
arXiv Detail & Related papers (2023-12-10T06:03:57Z)
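DCIR's exact formulation is not given in the summary above; as a toy, hedged illustration of a consistency-based intrinsic reward (a bonus whose sign depends on whether agreeing with other agents is currently desirable), consider the sketch below. The function name, scale, and the externally supplied gate are all assumptions; in DCIR the consistency decision is learned.

```python
import numpy as np

def consistency_intrinsic_reward(my_action, other_actions,
                                 want_consistent, scale=0.1):
    """Toy consistency bonus: reward agreement with other agents when the
    gate says consistency helps, penalise agreement otherwise."""
    # Fraction of other agents whose action matches ours.
    agreement = np.mean([float(np.array_equal(my_action, a))
                         for a in other_actions])
    sign = 1.0 if want_consistent else -1.0
    return scale * sign * agreement

# Example: two of three teammates agree with us; consistency is desired.
r_int = consistency_intrinsic_reward(1, [1, 1, 0], want_consistent=True)
```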
- ProAgent: Building Proactive Cooperative Agents with Large Language Models [89.53040828210945]
ProAgent is a novel framework that harnesses large language models to create proactive agents.
ProAgent can analyze the present state and infer the intentions of teammates from observations.
ProAgent exhibits a high degree of modularity and interpretability, making it easily integrated into various coordination scenarios.
arXiv Detail & Related papers (2023-08-22T10:36:56Z)
- AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors [93.38830440346783]
We propose AgentVerse, a multi-agent framework that can collaboratively adjust its composition as a greater-than-the-sum-of-its-parts system.
Our experiments demonstrate that AgentVerse can effectively deploy multi-agent groups that outperform a single agent.
In view of these behaviors, we discuss some possible strategies to leverage positive ones and mitigate negative ones for improving the collaborative potential of multi-agent groups.
arXiv Detail & Related papers (2023-08-21T16:47:11Z)
- Multi-agent Deep Covering Skill Discovery [50.812414209206054]
We propose Multi-agent Deep Covering Option Discovery, which constructs the multi-agent options through minimizing the expected cover time of the multiple agents' joint state space.
Also, we propose a novel framework to adopt the multi-agent options in the MARL process.
We show that the proposed algorithm can effectively capture agent interactions with the attention mechanism, successfully identify multi-agent options, and significantly outperform prior works using single-agent options or no options.
arXiv Detail & Related papers (2022-10-07T00:40:59Z)
- Cooperative and Competitive Biases for Multi-Agent Reinforcement Learning [12.676356746752893]
Training a multi-agent reinforcement learning (MARL) algorithm is more challenging than training a single-agent reinforcement learning algorithm.
We propose an algorithm that boosts MARL training using the biased action information of other agents based on a friend-or-foe concept.
We empirically demonstrate that our algorithm outperforms existing algorithms in various mixed cooperative-competitive environments.
arXiv Detail & Related papers (2021-01-18T05:52:22Z)
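The "friend-or-foe concept" above echoes Littman's classic friend-or-foe Q-learning, in which friends are assumed to choose actions that help you and foes actions that hurt you. A toy sketch of that bias on a tabular Q-tensor follows; the shapes and names are illustrative and this is not the paper's algorithm.

```python
import numpy as np

def friend_or_foe_value(q, gamma=0.99):
    """q[i, j, k]: value of (my action i, friend action j, foe action k).
    Bias: the foe is assumed adversarial (min), the friend cooperative (max)."""
    worst_foe = q.min(axis=2)          # foe hurts us: minimise over its actions
    best_team = worst_foe.max(axis=1)  # friend helps us: maximise over its actions
    return gamma * best_team.max()     # our greedy, discounted bootstrap value

# Example: random 3-agent payoff tensor with 4 actions per agent.
v = friend_or_foe_value(np.random.default_rng(0).normal(size=(4, 4, 4)))
```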
- UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning [53.73686229912562]
We propose a novel MARL approach called Universal Value Exploration (UneVEn).
UneVEn learns a set of related tasks simultaneously with a linear decomposition of universal successor features.
Empirical results on a set of exploration games, challenging cooperative predator-prey tasks requiring significant coordination among agents, and StarCraft II micromanagement benchmarks show that UneVEn can solve tasks where other state-of-the-art MARL methods fail.
arXiv Detail & Related papers (2020-10-06T19:08:47Z)
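The "linear decomposition of universal successor features" above refers to the standard successor-features identity Q(s, a; w) = psi(s, a) . w, which lets features psi be reused across tasks w. A minimal numpy sketch of that identity and the generalised policy improvement step it enables; the array shapes are assumptions, and this is only the building block, not UneVEn's full method.

```python
import numpy as np

def q_from_sf(psi, w):
    """Successor-features identity: Q(s, a; w) = psi(s, a) . w.
    psi: (n_actions, d) features at the current state; w: (d,) task weights."""
    return psi @ w

def gpi_action(psis, w):
    """Generalised policy improvement: act greedily w.r.t. the best of
    several policies' value estimates. psis: (n_policies, n_actions, d)."""
    q = np.einsum('pad,d->pa', psis, w)   # Q for every policy and action
    return int(q.max(axis=0).argmax())    # best action under the max envelope

# Example: 3 policies, 5 actions, 8-dim features, one task vector.
a = gpi_action(np.random.default_rng(1).normal(size=(3, 5, 8)),
               np.random.default_rng(2).normal(size=8))
```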
- A Cordial Sync: Going Beyond Marginal Policies for Multi-Agent Embodied Tasks [111.34055449929487]
We introduce the novel task FurnMove in which agents work together to move a piece of furniture through a living room to a goal.
Unlike existing tasks, FurnMove requires agents to coordinate at every timestep.
We identify two challenges when training agents to complete FurnMove; one is that existing decentralized action sampling procedures do not permit expressive joint action policies.
Using SYNC-policies and CORDIAL, our agents achieve a 58% completion rate on FurnMove, an impressive absolute gain of 25 percentage points over competitive decentralized baselines.
arXiv Detail & Related papers (2020-07-09T17:59:57Z)
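The sampling challenge above arises because independent per-agent sampling can only express product (uncorrelated) joint policies. In the spirit of the paper's SYNC-policies, one way to recover expressivity while staying decentralized is to sample from a mixture of product policies using shared randomness; the sketch below illustrates that general idea under assumed shapes and names, and is not the paper's implementation.

```python
import numpy as np

def sync_sample(mixture_weights, my_marginals, shared_seed, my_rng):
    """One agent's draw from a mixture-of-products joint policy.

    mixture_weights: (K,) mixture over K components, identical across agents;
    my_marginals: (K, n_actions) this agent's per-component distribution.
    Every agent seeds the component choice with the same `shared_seed`, so
    all agents pick the SAME component k without exchanging actions, giving
    a correlated joint policy that independent marginals cannot express.
    """
    k = np.random.default_rng(shared_seed).choice(
        len(mixture_weights), p=mixture_weights)
    return int(my_rng.choice(my_marginals.shape[1], p=my_marginals[k]))

# Two agents with a synced seed land in the same mixture component:
w = np.array([0.5, 0.5])
a0 = sync_sample(w, np.array([[0.9, 0.1], [0.1, 0.9]]), shared_seed=7,
                 my_rng=np.random.default_rng(10))
a1 = sync_sample(w, np.array([[0.9, 0.1], [0.1, 0.9]]), shared_seed=7,
                 my_rng=np.random.default_rng(11))
```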
- Natural Emergence of Heterogeneous Strategies in Artificially Intelligent Competitive Teams [0.0]
We develop a competitive multi-agent environment called FortAttack in which two teams compete against each other.
We observe a natural emergence of heterogeneous behavior amongst homogeneous agents when such behavior can lead to the team's success.
We propose ensemble training, in which we utilize the evolved opponent strategies to train a single policy for friendly agents.
arXiv Detail & Related papers (2020-07-06T22:35:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.