Cooperative Open-ended Learning Framework for Zero-shot Coordination
- URL: http://arxiv.org/abs/2302.04831v4
- Date: Wed, 28 Feb 2024 23:58:20 GMT
- Title: Cooperative Open-ended Learning Framework for Zero-shot Coordination
- Authors: Yang Li, Shao Zhang, Jichen Sun, Yali Du, Ying Wen, Xinbing Wang, Wei
Pan
- Abstract summary: We propose a framework to construct open-ended objectives in cooperative games with two players.
We also propose a practical algorithm that leverages knowledge from game theory and graph theory.
Our method outperforms current state-of-the-art methods when coordinating with different-level partners.
- Score: 35.330951448600594
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Zero-shot coordination in cooperative artificial intelligence (AI) remains a
significant challenge, which means effectively coordinating with a wide range
of unseen partners. Previous algorithms have attempted to address this
challenge by optimizing fixed objectives within a population to improve
strategy or behaviour diversity. However, these approaches can result in a loss
of learning and an inability to cooperate with certain strategies within the
population, known as cooperative incompatibility. To address this issue, we
propose the Cooperative Open-ended LEarning (COLE) framework, which constructs
open-ended objectives in cooperative games with two players from the
perspective of graph theory to assess and identify the cooperative ability of
each strategy. We further specify the framework and propose a practical
algorithm that leverages knowledge from game theory and graph theory.
Furthermore, an analysis of the learning process of the algorithm shows that it
can efficiently overcome cooperative incompatibility. The experimental results
in the Overcooked game environment demonstrate that our method outperforms
current state-of-the-art methods when coordinating with different-level
partners. Our demo is available at https://sites.google.com/view/cole-2023.
Related papers
- Multi-agent cooperation through learning-aware policy gradients [53.63948041506278]
Self-interested individuals often fail to cooperate, posing a fundamental challenge for multi-agent learning.
We present the first unbiased, higher-derivative-free policy gradient algorithm for learning-aware reinforcement learning.
We derive from the iterated prisoner's dilemma a novel explanation for how and when cooperation arises among self-interested learning-aware agents.
arXiv Detail & Related papers (2024-10-24T10:48:42Z) - Decision-focused Graph Neural Networks for Combinatorial Optimization [62.34623670845006]
An emerging strategy to tackle optimization problems involves the adoption of graph neural networks (GNNs) as an alternative to traditional algorithms.
Despite the growing popularity of GNNs and traditional algorithm solvers in the realm of CO, there is limited research on their integrated use and the correlation between them within an end-to-end framework.
We introduce a decision-focused framework that utilizes GNNs to address CO problems with auxiliary support.
arXiv Detail & Related papers (2024-06-05T22:52:27Z) - Graph Enhanced Reinforcement Learning for Effective Group Formation in Collaborative Problem Solving [3.392758494801288]
This study addresses the challenge of forming effective groups in collaborative problem-solving environments.
We propose a novel approach leveraging graph theory and reinforcement learning.
arXiv Detail & Related papers (2024-03-15T04:04:40Z) - Decentralized and Lifelong-Adaptive Multi-Agent Collaborative Learning [57.652899266553035]
Decentralized and lifelong-adaptive multi-agent collaborative learning aims to enhance collaboration among multiple agents without a central server.
We propose DeLAMA, a decentralized multi-agent lifelong collaborative learning algorithm with dynamic collaboration graphs.
arXiv Detail & Related papers (2024-03-11T09:21:11Z) - Aligning Individual and Collective Objectives in Multi-Agent Cooperation [18.082268221987956]
Mixed-motive cooperation is one of the most prominent challenges in multi-agent learning.
We introduce a novel optimization method named textbftextitAltruistic textbftextitGradient textbftextitAdjustment (textbftextitAgA) that employs gradient adjustments to progressively align individual and collective objectives.
We evaluate the effectiveness of our algorithm AgA through benchmark environments for testing mixed-motive collaboration with small-scale agents.
arXiv Detail & Related papers (2024-02-19T08:18:53Z) - Neural Population Learning beyond Symmetric Zero-sum Games [52.20454809055356]
We introduce NeuPL-JPSRO, a neural population learning algorithm that benefits from transfer learning of skills and converges to a Coarse Correlated (CCE) of the game.
Our work shows that equilibrium convergent population learning can be implemented at scale and in generality.
arXiv Detail & Related papers (2024-01-10T12:56:24Z) - Tackling Cooperative Incompatibility for Zero-Shot Human-AI Coordination [36.33334853998621]
We introduce the Cooperative Open-ended LEarning (COLE) framework to solve cooperative incompatibility in learning.
COLE formulates open-ended objectives in cooperative games with two players using perspectives of graph theory to evaluate and pinpoint the cooperative capacity of each strategy.
We show that COLE could effectively overcome the cooperative incompatibility from theoretical and empirical analysis.
arXiv Detail & Related papers (2023-06-05T16:51:38Z) - A Reinforcement Learning-assisted Genetic Programming Algorithm for Team
Formation Problem Considering Person-Job Matching [70.28786574064694]
A reinforcement learning-assisted genetic programming algorithm (RL-GP) is proposed to enhance the quality of solutions.
The hyper-heuristic rules obtained through efficient learning can be utilized as decision-making aids when forming project teams.
arXiv Detail & Related papers (2023-04-08T14:32:12Z) - Any-Play: An Intrinsic Augmentation for Zero-Shot Coordination [0.4153433779716327]
We formalize an alternative criteria for evaluating cooperative AI, referred to as inter-algorithm cross-play.
We show that existing state-of-the-art cooperative AI algorithms, such as Other-Play and Off-Belief Learning, under-perform in this paradigm.
We propose the Any-Play learning augmentation for generalizing self-play-based algorithms to the inter-algorithm cross-play setting.
arXiv Detail & Related papers (2022-01-28T21:43:58Z) - Finding Core Members of Cooperative Games using Agent-Based Modeling [0.0]
Agent-based modeling (ABM) is a powerful paradigm to gain insight into social phenomena.
In this paper, a algorithm is developed that can be embedded into an ABM to allow the agents to find coalition.
arXiv Detail & Related papers (2020-08-30T17:38:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.