On the Impossibility of Learning to Cooperate with Adaptive Partner
Strategies in Repeated Games
- URL: http://arxiv.org/abs/2206.10614v1
- Date: Mon, 20 Jun 2022 16:59:12 GMT
- Title: On the Impossibility of Learning to Cooperate with Adaptive Partner
Strategies in Repeated Games
- Authors: Robert Loftin and Frans A. Oliehoek
- Abstract summary: We show that no learning algorithm can reliably learn to cooperate with all possible adaptive partners in a repeated matrix game.
We then discuss potential alternative assumptions which capture the idea that an adaptive partner will only adapt rationally to our behavior.
- Score: 13.374518263328763
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning to cooperate with other agents is challenging when those agents also
possess the ability to adapt to our own behavior. Practical and theoretical
approaches to learning in cooperative settings typically assume that other
agents' behaviors are stationary, or else make very specific assumptions about
other agents' learning processes. The goal of this work is to understand
whether we can reliably learn to cooperate with other agents without such
restrictive assumptions, which are unlikely to hold in real-world applications.
Our main contribution is a set of impossibility results, which show that no
learning algorithm can reliably learn to cooperate with all possible adaptive
partners in a repeated matrix game, even if that partner is guaranteed to
cooperate with some stationary strategy. Motivated by these results, we then
discuss potential alternative assumptions which capture the idea that an
adaptive partner will only adapt rationally to our behavior.
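To make the setting concrete, here is a minimal sketch of a repeated matrix game between a learning agent and one hypothetical adaptive partner (a grim-trigger strategy). The payoff matrix, partner, and learner are illustrative assumptions, not the paper's construction:

```python
# A minimal sketch of the repeated-matrix-game setting, assuming a
# prisoner's-dilemma payoff matrix and a grim-trigger partner.
# All names and payoffs are illustrative only.
import random

# Row player's payoffs: actions 0 = cooperate, 1 = defect.
PAYOFF = [[3, 0],
          [4, 1]]

def grim_trigger(history):
    """Adaptive partner: cooperates until we defect once, then defects forever."""
    return 1 if any(our == 1 for our, _ in history) else 0

def epsilon_greedy_learner(history, epsilon=0.1):
    """A naive learner that estimates per-action returns from past rounds."""
    if random.random() < epsilon or not history:
        return random.choice([0, 1])
    totals = {0: [], 1: []}
    for our, theirs in history:
        totals[our].append(PAYOFF[our][theirs])
    means = {a: sum(v) / len(v) if v else 0.0 for a, v in totals.items()}
    return max(means, key=means.get)

history, ret = [], 0
for t in range(1000):
    ours, theirs = epsilon_greedy_learner(history), grim_trigger(history)
    ret += PAYOFF[ours][theirs]
    history.append((ours, theirs))

print(f"average per-round return: {ret / 1000:.2f}")
```

Even this toy partner illustrates the difficulty: a single exploratory defection permanently changes the partner's behavior, so exploration itself can destroy the very cooperation the learner is trying to discover, which is the kind of failure the impossibility results formalize.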
Related papers
- Multi-agent cooperation through learning-aware policy gradients [53.63948041506278]
Self-interested individuals often fail to cooperate, posing a fundamental challenge for multi-agent learning.
We present the first unbiased, higher-derivative-free policy gradient algorithm for learning-aware reinforcement learning.
We derive from the iterated prisoner's dilemma a novel explanation for how and when cooperation arises among self-interested learning-aware agents.
arXiv Detail & Related papers (2024-10-24T10:48:42Z)
- Decentralized and Lifelong-Adaptive Multi-Agent Collaborative Learning [57.652899266553035]
Decentralized and lifelong-adaptive multi-agent collaborative learning aims to enhance collaboration among multiple agents without a central server.
We propose DeLAMA, a decentralized multi-agent lifelong collaborative learning algorithm with dynamic collaboration graphs.
arXiv Detail & Related papers (2024-03-11T09:21:11Z)
- RLIF: Interactive Imitation Learning as Reinforcement Learning [56.997263135104504]
We show how off-policy reinforcement learning can enable improved performance under assumptions that are similar to, but potentially even more practical than, those of interactive imitation learning.
Our proposed method uses reinforcement learning with user intervention signals themselves as rewards.
This relaxes the assumption that intervening experts in interactive imitation learning must be near-optimal, and enables the algorithm to learn behaviors that improve over a potentially suboptimal human expert.
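A minimal sketch of the intervention-as-reward idea, assuming hypothetical `env`, `agent`, and `expert` interfaces rather than the paper's actual API:

```python
# The agent receives a negative reward whenever the expert intervenes,
# and no task reward otherwise; the interfaces below are placeholders.

def rlif_rollout(env, agent, expert, max_steps=200):
    """Collect one episode where expert interventions define the reward."""
    obs = env.reset()
    transitions = []
    for _ in range(max_steps):
        action = agent.act(obs)
        if expert.would_intervene(obs, action):
            action = expert.act(obs)   # expert takes over for this step
            reward = -1.0              # the intervention itself is the penalty
        else:
            reward = 0.0               # no task reward is required
        next_obs, done = env.step(action)
        transitions.append((obs, action, reward, next_obs, done))
        obs = next_obs
        if done:
            break
    return transitions  # fed to any off-policy RL learner
```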
arXiv Detail & Related papers (2023-11-21T21:05:21Z)
- Tackling Cooperative Incompatibility for Zero-Shot Human-AI Coordination [36.33334853998621]
We introduce the Cooperative Open-ended LEarning (COLE) framework to solve cooperative incompatibility in learning.
COLE formulates open-ended objectives in two-player cooperative games from a graph-theoretic perspective in order to evaluate and pinpoint the cooperative capacity of each strategy.
Theoretical and empirical analyses show that COLE effectively overcomes cooperative incompatibility.
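One way to make the graph-theoretic evaluation concrete (a loose sketch in the spirit of, but not identical to, COLE's formulation) is to treat strategies as nodes of a cross-play graph and score each one by its average payoff when paired with the others:

```python
# Hedged sketch: edge weights are average cross-play returns between
# strategies; a strategy's "cooperative capacity" is its mean edge weight.
import numpy as np

def cooperative_capacity(crossplay: np.ndarray) -> np.ndarray:
    """crossplay[i, j] = average return when strategy i plays with j."""
    n = crossplay.shape[0]
    off_diag = crossplay * (1 - np.eye(n))   # ignore self-play payoffs
    return off_diag.sum(axis=1) / (n - 1)    # mean payoff with other strategies

# Toy population of 3 strategies; all values are illustrative only.
M = np.array([[8.0, 2.0, 6.0],
              [3.0, 9.0, 1.0],
              [6.0, 2.0, 7.0]])
scores = cooperative_capacity(M)
print("weakest cooperator:", int(scores.argmin()))  # candidate to improve next
```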
arXiv Detail & Related papers (2023-06-05T16:51:38Z)
- Cooperative Open-ended Learning Framework for Zero-shot Coordination [35.330951448600594]
We propose a framework to construct open-ended objectives in cooperative games with two players.
We also propose a practical algorithm that leverages knowledge from game theory and graph theory.
Our method outperforms current state-of-the-art methods when coordinating with partners of varying skill levels.
arXiv Detail & Related papers (2023-02-09T18:37:04Z)
- On-the-fly Strategy Adaptation for ad-hoc Agent Coordination [21.029009561094725]
Training agents in cooperative settings offers the promise of AI agents able to interact effectively with humans (and other agents) in the real world.
However, the vast majority of prior work has focused on the self-play paradigm.
This paper proposes to solve this problem by adapting agent strategies on the fly, using a posterior belief over the other agents' strategy.
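A minimal sketch of maintaining a posterior belief over a finite set of candidate partner strategies and best-responding to it; the candidate set, payoffs, and likelihood model below are illustrative assumptions, not the paper's method:

```python
import numpy as np

# Candidate partner strategies as distributions over partner actions
# (0 = cooperate, 1 = defect); illustrative placeholders.
CANDIDATES = np.array([[0.9, 0.1],   # mostly cooperates
                       [0.1, 0.9],   # mostly defects
                       [0.5, 0.5]])  # plays randomly
PAYOFF = np.array([[3.0, 0.0],
                   [4.0, 1.0]])      # our payoff[our_action, their_action]

def update_belief(belief, partner_action):
    """Bayes rule: posterior proportional to prior times action likelihood."""
    posterior = belief * CANDIDATES[:, partner_action]
    return posterior / posterior.sum()

def best_response(belief):
    """Our action maximizing expected payoff under the current belief."""
    p_partner = belief @ CANDIDATES          # marginal over partner actions
    return int(np.argmax(PAYOFF @ p_partner))

belief = np.ones(3) / 3                      # uniform prior
for partner_action in [0, 0, 0, 1]:          # observed partner play
    belief = update_belief(belief, partner_action)
print(belief.round(3), "-> respond with action", best_response(belief))
```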
arXiv Detail & Related papers (2022-03-08T02:18:11Z)
- Conditional Imitation Learning for Multi-Agent Games [89.897635970366]
We study the problem of conditional multi-agent imitation learning, where we have access to joint trajectory demonstrations at training time.
We propose a novel approach to address the difficulties of scalability and data scarcity.
Our model learns a low-rank subspace over ego and partner agent strategies, then infers and adapts to a new partner strategy by interpolating in the subspace.
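A hedged sketch of the low-rank subspace idea: represent partner strategies as linear combinations of a few basis strategies learned from training partners, then place a new partner in that subspace by least squares. The toy data and SVD-based basis construction are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
# Rows: training partners; columns: per-state action probabilities (toy data).
train_strategies = rng.random((20, 8))

# Low-rank basis (rank 3) spanning the training strategies.
_, _, Vt = np.linalg.svd(train_strategies, full_matrices=False)
basis = Vt[:3]                                 # shape (3, 8)

def infer_partner(observed, basis):
    """Least-squares coefficients placing the new partner in the subspace."""
    coeffs, *_ = np.linalg.lstsq(basis.T, observed, rcond=None)
    return coeffs @ basis                      # reconstructed strategy estimate

new_partner = 0.6 * basis[0] + 0.4 * basis[1]  # lies in the subspace
estimate = infer_partner(new_partner, basis)
print(np.allclose(estimate, new_partner))      # True: recovered by interpolation
```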
arXiv Detail & Related papers (2022-01-05T04:40:13Z)
- Behaviour-conditioned policies for cooperative reinforcement learning tasks [41.74498230885008]
In various real-world tasks, an agent needs to cooperate with unknown partner agent types.
Deep reinforcement learning models can be trained to deliver the required functionality but are known to suffer from sample inefficiency and slow learning.
We propose a method in which we synthetically produce populations of agents with different behavioural patterns, together with ground-truth data describing their behaviour.
We also propose an agent architecture that can efficiently use the generated data and acquire meta-learning capability.
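A minimal sketch of the data-generation step, assuming a small set of scripted behavioural patterns; the pattern set and interfaces are illustrative, not the paper's construction:

```python
import random
from typing import Callable

def make_partner(pattern: str) -> Callable[[list], int]:
    """Return a scripted partner policy (actions 0/1) for a named pattern."""
    if pattern == "always_cooperate":
        return lambda history: 0
    if pattern == "always_defect":
        return lambda history: 1
    if pattern == "tit_for_tat":
        return lambda history: history[-1] if history else 0
    raise ValueError(pattern)

PATTERNS = ["always_cooperate", "always_defect", "tit_for_tat"]

# Each sample pairs a partner policy with its ground-truth behaviour label,
# so a behaviour-conditioned policy can be meta-trained to identify and adapt.
population = [(label, make_partner(label))
              for label in random.choices(PATTERNS, k=100)]
```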
arXiv Detail & Related papers (2021-10-04T09:16:41Z)
- On the Critical Role of Conventions in Adaptive Human-AI Collaboration [73.21967490610142]
We propose a learning framework that teases apart rule-dependent representation from convention-dependent representation.
We experimentally validate our approach on three collaborative tasks varying in complexity.
arXiv Detail & Related papers (2021-04-07T02:46:19Z)
- Deep Interactive Bayesian Reinforcement Learning via Meta-Learning [63.96201773395921]
The optimal adaptive behaviour under uncertainty over the other agents' strategies can be computed using the Interactive Bayesian Reinforcement Learning framework.
We propose to meta-learn approximate belief inference and Bayes-optimal behaviour for a given prior.
We show empirically that our approach outperforms existing methods that use a model-free approach, sample from the approximate posterior, maintain memory-free models of others, or do not fully utilise the known structure of the environment.
arXiv Detail & Related papers (2021-01-11T13:25:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.