Similarity-based cooperative equilibrium
- URL: http://arxiv.org/abs/2211.14468v2
- Date: Sun, 12 Nov 2023 16:56:44 GMT
- Title: Similarity-based cooperative equilibrium
- Authors: Caspar Oesterheld, Johannes Treutlein, Roger Grosse, Vincent Conitzer,
Jakob Foerster
- Abstract summary: In social dilemmas like the one-shot Prisoner's Dilemma, standard game theory predicts that ML agents will fail to cooperate with each other.
We introduce a more realistic setting in which agents only observe a single number indicating how similar they are to each other.
We prove that this allows for the same set of cooperative outcomes as the full transparency setting.
- Score: 29.779551971013074
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As machine learning agents act more autonomously in the world, they will
increasingly interact with each other. Unfortunately, in many social dilemmas
like the one-shot Prisoner's Dilemma, standard game theory predicts that ML
agents will fail to cooperate with each other. Prior work has shown that one
way to enable cooperative outcomes in the one-shot Prisoner's Dilemma is to
make the agents mutually transparent to each other, i.e., to allow them to
access one another's source code (Rubinstein 1998, Tennenholtz 2004) -- or
weights in the case of ML agents. However, full transparency is often
unrealistic, whereas partial transparency is commonplace. Moreover, it is
challenging for agents to learn their way to cooperation in the full
transparency setting. In this paper, we introduce a more realistic setting in
which agents only observe a single number indicating how similar they are to
each other. We prove that this allows for the same set of cooperative outcomes
as the full transparency setting. We also demonstrate experimentally that
cooperation can be learned using simple ML methods.
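To make the similarity setting concrete, here is a minimal Python sketch, not the authors' implementation: the payoff matrix, the scalar policy parameters, the similarity measure, and the 0.9 threshold are all illustrative assumptions. Each agent observes only the single similarity score and cooperates in the one-shot Prisoner's Dilemma exactly when that score is high.

```python
# One-shot Prisoner's Dilemma payoffs as (row, column) utilities.
# Standard textbook values, used here only for illustration.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

def similarity(theta_a: float, theta_b: float) -> float:
    """The single number both agents observe: one minus the distance
    between two scalar policy parameters. This is an assumed stand-in
    for whatever similarity signal the environment provides."""
    return 1.0 - abs(theta_a - theta_b)

def act(sim: float, threshold: float = 0.9) -> str:
    """Threshold policy: cooperate only when observed similarity is high."""
    return "C" if sim >= threshold else "D"

def play(theta_a: float, theta_b: float):
    sim = similarity(theta_a, theta_b)  # both agents see the same score
    return PAYOFFS[(act(sim), act(sim))]

# Near-identical agents cooperate; dissimilar agents defect.
print(play(0.95, 0.97))  # -> (3, 3), mutual cooperation
print(play(0.95, 0.30))  # -> (1, 1), mutual defection
```

Informally, the sketch also suggests why such a profile can be stable: a unilateral switch to an always-defect program makes the deviator less similar to its partner, pushes the observed score below the threshold, and forfeits the mutual-cooperation payoff. The paper proves which cooperative outcomes this mechanism can support.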
Related papers
- On the Complexity of Learning to Cooperate with Populations of Socially Rational Agents [17.015143707851358]
We consider the problem of cooperating with a population of agents in a finitely repeated, two-player general-sum matrix game with private utilities.
Our results first show that these assumptions alone are insufficient to ensure zero-shot cooperation with members of the target population.
We provide upper and lower bounds on the number of samples needed to learn an effective cooperation strategy.
arXiv Detail & Related papers (2024-06-29T11:59:52Z)
- Cooperate or Collapse: Emergence of Sustainable Cooperation in a Society of LLM Agents [101.17919953243107]
GovSim is a generative simulation platform designed to study strategic interactions and cooperative decision-making in large language models (LLMs).
We find that all but the most powerful LLM agents fail to achieve a sustainable equilibrium in GovSim, with the highest survival rate below 54%.
We show that agents that leverage "Universalization"-based reasoning, a theory of moral thinking, are able to achieve significantly better sustainability.
arXiv Detail & Related papers (2024-04-25T15:59:16Z)
- Beyond Joint Demonstrations: Personalized Expert Guidance for Efficient Multi-Agent Reinforcement Learning [54.40927310957792]
We introduce a novel concept of personalized expert demonstrations, tailored for each individual agent or, more broadly, each individual type of agent within a heterogeneous team.
These demonstrations pertain solely to single-agent behaviors and to how each agent can achieve its personal goals, without encompassing any cooperative elements.
We propose an approach that selectively utilizes personalized expert demonstrations as guidance and allows agents to learn to cooperate.
arXiv Detail & Related papers (2024-03-13T20:11:20Z)
- BM2CP: Efficient Collaborative Perception with LiDAR-Camera Modalities [5.034692611033509]
We propose a collaborative perception paradigm, BM2CP, which employs LiDAR and camera to achieve efficient multi-modal perception.
It can cope with the special case in which any agent is missing one of its sensors, whether of the same or a different type.
Our approach outperforms state-of-the-art methods with 50x lower communication volume in both simulated and real-world autonomous driving scenarios.
arXiv Detail & Related papers (2023-10-23T08:45:12Z)
- Exploring Collaboration Mechanisms for LLM Agents: A Social Psychology View [60.80731090755224]
This paper probes the collaboration mechanisms among contemporary NLP systems through practical experiments combined with theoretical insights.
We fabricate four unique 'societies' composed of LLM agents, where each agent is characterized by a specific 'trait' (easy-going or overconfident) and engages in collaboration with a distinct 'thinking pattern' (debate or reflection).
Our results further illustrate that LLM agents manifest human-like social behaviors, such as conformity and consensus reaching, mirroring social psychology theories.
arXiv Detail & Related papers (2023-10-03T15:05:52Z)
- On the Impossibility of Learning to Cooperate with Adaptive Partner Strategies in Repeated Games [13.374518263328763]
We show that no learning algorithm can reliably learn to cooperate with all possible adaptive partners in a repeated matrix game.
We then discuss potential alternative assumptions which capture the idea that an adaptive partner will only adapt rationally to our behavior.
arXiv Detail & Related papers (2022-06-20T16:59:12Z)
- Cooperative Online Learning in Stochastic and Adversarial MDPs [50.62439652257712]
We study cooperative online learning in stochastic and adversarial Markov decision processes (MDPs).
In each episode, $m$ agents interact with an MDP simultaneously and share information in order to minimize their individual regret.
We are the first to consider cooperative reinforcement learning (RL) either with non-fresh randomness or in adversarial MDPs.
arXiv Detail & Related papers (2022-01-31T12:32:11Z)
- Collective eXplainable AI: Explaining Cooperative Strategies and Agent Contribution in Multiagent Reinforcement Learning with Shapley Values [68.8204255655161]
This study proposes a novel approach to explain cooperative strategies in multiagent RL using Shapley values; a toy sketch of the Shapley computation follows this list.
Results could have implications for non-discriminatory decision making, ethical and responsible AI-derived decisions or policy making under fairness constraints.
arXiv Detail & Related papers (2021-10-04T10:28:57Z)
- Cooperative-Competitive Reinforcement Learning with History-Dependent Rewards [12.41853254173419]
We show that an agent's decision-making problem can be modeled as an interactive partially observable Markov decision process (I-POMDP).
We present an interactive advantage actor-critic method (IA2C$+$), which combines the independent advantage actor-critic network with a belief filter.
Empirical results show that IA2C$+$ learns the optimal policy faster and more robustly than several other baselines.
arXiv Detail & Related papers (2020-10-15T21:37:07Z)
- UneVEn: Universal Value Exploration for Multi-Agent Reinforcement Learning [53.73686229912562]
We propose a novel MARL approach called Universal Value Exploration (UneVEn).
UneVEn learns a set of related tasks simultaneously with a linear decomposition of universal successor features.
Empirical results on a set of exploration games, challenging cooperative predator-prey tasks requiring significant coordination among agents, and StarCraft II micromanagement benchmarks show that UneVEn can solve tasks where other state-of-the-art MARL methods fail.
arXiv Detail & Related papers (2020-10-06T19:08:47Z)
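For the Collective eXplainable AI entry above, the underlying computation is easy to state in code. Below is a toy, exact-enumeration Shapley sketch, not the paper's implementation: the agent names and the team-reward table are hypothetical, standing in for evaluation runs of a trained multiagent RL team with only a subset of agents acting.

```python
from itertools import combinations
from math import factorial

def shapley_values(agents, value):
    """Exact Shapley values: each agent's marginal contribution to the
    team value, averaged over all coalitions with the standard weights.
    Exponential in the number of agents, so only for small teams."""
    n = len(agents)
    phi = {a: 0.0 for a in agents}
    for agent in agents:
        others = [a for a in agents if a != agent]
        for k in range(n):
            for coalition in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                marginal = value(set(coalition) | {agent}) - value(set(coalition))
                phi[agent] += weight * marginal
    return phi

# Hypothetical team-reward table: the value of each agent subset, as if
# measured by evaluating the team with only those agents acting.
REWARDS = {
    frozenset(): 0.0,
    frozenset({"a1"}): 2.0,
    frozenset({"a2"}): 1.0,
    frozenset({"a1", "a2"}): 5.0,  # cooperation bonus beyond 2.0 + 1.0
}

print(shapley_values(["a1", "a2"], lambda s: REWARDS[frozenset(s)]))
# -> {'a1': 3.0, 'a2': 2.0}; contributions sum to the full-team value
```

Exact enumeration is exponential in the number of agents, so larger teams would approximate these values, for example by sampling coalitions.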
This list is automatically generated from the titles and abstracts of the papers on this site.