Open Ad Hoc Teamwork with Cooperative Game Theory
- URL: http://arxiv.org/abs/2402.15259v5
- Date: Sun, 7 Jul 2024 12:43:35 GMT
- Title: Open Ad Hoc Teamwork with Cooperative Game Theory
- Authors: Jianhong Wang, Yang Li, Yuan Zhang, Wei Pan, Samuel Kaski
- Abstract summary: Ad hoc teamwork poses a challenging problem, requiring the design of an agent to collaborate with teammates without prior coordination or joint training.
One promising solution is leveraging the generalizability of graph neural networks to handle an unrestricted number of agents.
We propose a novel algorithm named CIAO, based on the graph-based policy learning (GPL) framework, with additional provable implementation tricks that can facilitate learning.
- Score: 28.605478081031215
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Ad hoc teamwork poses a challenging problem, requiring the design of an agent to collaborate with teammates without prior coordination or joint training. Open ad hoc teamwork (OAHT) further complicates this challenge by considering environments with a changing number of teammates, referred to as open teams. One promising solution in practice to this problem is leveraging the generalizability of graph neural networks to handle an unrestricted number of agents with various agent-types, named graph-based policy learning (GPL). However, its joint Q-value representation over a coordination graph lacks convincing explanations. In this paper, we establish a new theory to understand the representation of the joint Q-value for OAHT and its learning paradigm, through the lens of cooperative game theory. Building on our theory, we propose a novel algorithm named CIAO, based on GPL's framework, with additional provable implementation tricks that can facilitate learning. The demos of experimental results are available on https://sites.google.com/view/ciao2024, and the code of experiments is published on https://github.com/hsvgbkhgbv/CIAO.
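The abstract refers to a joint Q-value representation over a coordination graph but does not spell out its form. As a minimal sketch, assuming the common coordination-graph factorization into per-agent and pairwise utility terms on a fully connected graph, the joint value can be computed for any team size as shown below. The function name `joint_q_value` and the dictionary layout are hypothetical and chosen for illustration; this is not the authors' implementation.

```python
from itertools import combinations


def joint_q_value(individual_q, pairwise_q, actions):
    """Sum per-agent and pairwise utilities over a fully connected graph.

    Assumed (hypothetical) data layout:
      individual_q: {agent: {action: utility}}
      pairwise_q:   {(agent_i, agent_j): {(action_i, action_j): utility}}
                    with agent_i < agent_j
      actions:      {agent: chosen action}
    """
    agents = sorted(actions)
    # Per-agent terms: one utility per agent currently in the team.
    total = sum(individual_q[i][actions[i]] for i in agents)
    # Pairwise terms: one utility per unordered pair of agents (graph edge).
    for i, j in combinations(agents, 2):
        total += pairwise_q[(i, j)][(actions[i], actions[j])]
    return total


if __name__ == "__main__":
    individual_q = {0: {"a": 1.0, "b": 0.0}, 1: {"a": 0.5, "b": 2.0}}
    pairwise_q = {
        (0, 1): {("a", "a"): 0.25, ("a", "b"): -1.0,
                 ("b", "a"): 0.0, ("b", "b"): 0.5}
    }
    print(joint_q_value(individual_q, pairwise_q, {0: "a", 1: "b"}))
```

Because the sum ranges over whichever agents appear in `actions`, the same function applies unchanged when teammates join or leave, which is the property that makes this kind of factorization attractive for open teams.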
Related papers
- Neural Population Learning beyond Symmetric Zero-sum Games [52.20454809055356]
We introduce NeuPL-JPSRO, a neural population learning algorithm that benefits from transfer learning of skills and converges to a Coarse Correlated Equilibrium (CCE) of the game.
Our work shows that equilibrium convergent population learning can be implemented at scale and in generality.
arXiv Detail & Related papers (2024-01-10T12:56:24Z)
- Cooperative Open-ended Learning Framework for Zero-shot Coordination [35.330951448600594]
We propose a framework to construct open-ended objectives in cooperative games with two players.
We also propose a practical algorithm that leverages knowledge from game theory and graph theory.
Our method outperforms current state-of-the-art methods when coordinating with different-level partners.
arXiv Detail & Related papers (2023-02-09T18:37:04Z)
- PyGlove: Efficiently Exchanging ML Ideas as Code [81.80955202879686]
PyGlove represents ideas as symbolic rule-based patches, enabling researchers to write down the rules for models they have not seen.
This permits a network effect among teams: at once, any team can issue patches to all other teams.
arXiv Detail & Related papers (2023-02-03T18:52:09Z)
- A General Learning Framework for Open Ad Hoc Teamwork Using Graph-based Policy Learning [11.998708550268978]
We develop a class of solutions for open ad hoc teamwork under full and partial observability.
We show that our solution can learn efficient policies in open ad hoc teamwork in fully and partially observable cases.
arXiv Detail & Related papers (2022-10-11T13:44:44Z)
- RACA: Relation-Aware Credit Assignment for Ad-Hoc Cooperation in Multi-Agent Deep Reinforcement Learning [55.55009081609396]
We propose a novel method, called Relation-Aware Credit Assignment (RACA), which achieves zero-shot generalization in ad-hoc cooperation scenarios.
RACA takes advantage of a graph-based relation encoder to encode the topological structure between agents.
Our method outperforms baseline methods on the StarCraftII micromanagement benchmark and ad-hoc cooperation scenarios.
arXiv Detail & Related papers (2022-06-02T03:39:27Z)
- Privatized Graph Federated Learning [57.14673504239551]
We introduce graph federated learning, which consists of multiple units connected by a graph.
We show how graph homomorphic perturbations can be used to ensure the algorithm is differentially private.
arXiv Detail & Related papers (2022-03-14T13:48:23Z)
- Towards Collaborative Question Answering: A Preliminary Study [63.91687114660126]
We propose CollabQA, a novel QA task in which several expert agents coordinated by a moderator work together to answer questions that cannot be answered with any single agent alone.
We make a synthetic dataset of a large knowledge graph that can be distributed to experts.
We show that the problem can be challenging without introducing a prior on the collaboration structure, unless experts are perfect and uniform.
arXiv Detail & Related papers (2022-01-24T14:27:00Z)
- Finding Core Members of Cooperative Games using Agent-Based Modeling [0.0]
Agent-based modeling (ABM) is a powerful paradigm to gain insight into social phenomena.
In this paper, an algorithm is developed that can be embedded into an ABM to allow the agents to find coalitions.
arXiv Detail & Related papers (2020-08-30T17:38:43Z)
- Towards Open Ad Hoc Teamwork Using Graph-based Policy Learning [11.480994804659908]
We build on graph neural networks to learn agent models and joint-action value models under varying team compositions.
We empirically demonstrate that our approach successfully models the effects other agents have on the learner, leading to policies that robustly adapt to dynamic team compositions.
arXiv Detail & Related papers (2020-06-18T10:39:41Z)
- Evaluating and Rewarding Teamwork Using Cooperative Game Abstractions [103.3630903577951]
We use cooperative game theory to study teams of artificial RL agents as well as real world teams from professional sports.
We introduce a parametric model called cooperative game abstractions (CGAs) for estimating characteristic functions (CFs) from data.
We provide identification results and sample complexity bounds for CGA models as well as error bounds in the estimation of the Shapley Value using CGAs.
arXiv Detail & Related papers (2020-06-16T22:03:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.