Dilution, Diffusion and Symbiosis in Spatial Prisoner's Dilemma with Reinforcement Learning
- URL: http://arxiv.org/abs/2507.02211v2
- Date: Fri, 04 Jul 2025 13:32:01 GMT
- Title: Dilution, Diffusion and Symbiosis in Spatial Prisoner's Dilemma with Reinforcement Learning
- Authors: Gustavo C. Mangold, Heitor C. M. Fernandes, Mendeli H. Vainstein
- Abstract summary: We study the effects of dilution and mobility in the spatial version of the prisoner's dilemma. We show that games with fixed update rules can be qualitatively equivalent to those with learned ones.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent studies of spatial prisoner's dilemma games with reinforcement learning have shown that static agents can learn to cooperate through a diverse set of mechanisms, including noise injection, different types of learning algorithms and neighbours' payoff knowledge. In this work, using an independent multi-agent Q-learning algorithm, we study the effects of dilution and mobility in the spatial version of the prisoner's dilemma. Within this setting, we define different possible actions for the algorithm, connecting with previous results on the classical, non-reinforcement-learning spatial prisoner's dilemma; this showcases the versatility of the algorithm in modeling different game-theoretical scenarios and its potential as a benchmark. We observe a range of effects, including evidence that games with fixed update rules can be qualitatively equivalent to those with learned ones, as well as the emergence of a symbiotic, mutualistic effect between populations that forms when multiple actions are defined.
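As a rough illustration of the setup the abstract describes, the sketch below runs independent tabular Q-learners on a diluted square lattice playing the weak prisoner's dilemma. It is a minimal sketch under our own assumptions, not the authors' code: payoffs are R = 1, T = b, S = P = 0, the state is the number of cooperating neighbours, selection is epsilon-greedy, all parameter values are illustrative, and the mobility/diffusion action studied in the paper is omitted for brevity.

```python
import numpy as np

# Minimal sketch (our assumptions, not the authors' code) of independent
# multi-agent Q-learning on a diluted lattice playing the weak prisoner's
# dilemma: R = 1, T = b, S = P = 0; state = number of cooperating neighbours.
L, density = 20, 0.8                 # lattice side, fraction of occupied sites
b = 1.4                              # temptation payoff (illustrative)
alpha, gamma, eps = 0.1, 0.9, 0.05   # learning rate, discount, exploration
rng = np.random.default_rng(0)

occupied = rng.random((L, L)) < density      # dilution: some sites are empty
actions = rng.integers(0, 2, (L, L))         # 0 = defect, 1 = cooperate
Q = np.zeros((L, L, 5, 2))                   # one Q-table per lattice site

def coop_neighbours(i, j):
    """Number of occupied, cooperating von Neumann neighbours of site (i, j)."""
    return sum(occupied[(i + di) % L, (j + dj) % L]
               and actions[(i + di) % L, (j + dj) % L] == 1
               for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)))

for step in range(500):
    nc = np.array([[coop_neighbours(i, j) for j in range(L)] for i in range(L)])
    payoff = np.where(actions == 1, 1.0 * nc, b * nc)  # summed over neighbours
    new_actions = actions.copy()
    for i in range(L):
        for j in range(L):
            if not occupied[i, j]:
                continue
            s, a = nc[i, j], actions[i, j]
            # one-step Q update; next state approximated by the current one
            target = payoff[i, j] + gamma * Q[i, j, s].max()
            Q[i, j, s, a] += alpha * (target - Q[i, j, s, a])
            new_actions[i, j] = (rng.integers(0, 2) if rng.random() < eps
                                 else int(Q[i, j, s].argmax()))
    actions = new_actions

print("final cooperator fraction:", float((actions[occupied] == 1).mean()))
```

Even in this stripped-down form the essential ingredients are visible: a dilution mask of empty sites, one Q-table per site, and update rules that are learned rather than imitated from neighbours.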
Related papers
- Explaining and Mitigating the Modality Gap in Contrastive Multimodal Learning [7.412307614007383]
Multimodal learning models are designed to bridge different modalities, such as images and text, by learning a shared representation space.
These models often exhibit a modality gap, where different modalities occupy distinct regions within the shared representation space.
We identify the critical roles of mismatched data pairs and a learnable temperature parameter in causing and perpetuating the modality gap during training.
arXiv Detail & Related papers (2024-12-10T20:36:49Z)
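As a toy illustration of the modality gap this entry refers to, the following sketch (our construction, not the paper's) embeds "image" and "text" vectors on the unit sphere so that their centroids occupy distinct regions, then evaluates a CLIP-style contrastive loss whose temperature rescales all logits; in real training that temperature is a learnable parameter.

```python
import numpy as np

# Toy modality gap: two embedding clouds on the unit sphere whose centroids
# occupy distinct regions. All names and numbers are our assumptions.
rng = np.random.default_rng(0)

def normalize(x):
    return x / np.linalg.norm(x, axis=1, keepdims=True)

img = normalize(rng.normal(loc=+1.0, scale=0.1, size=(512, 64)))
txt = normalize(rng.normal(loc=-1.0, scale=0.1, size=(512, 64)))

tau = 0.07                                   # temperature (fixed here)
logits = img @ txt.T / tau                   # CLIP-style similarity logits
log_softmax = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
loss = -np.diag(log_softmax).mean()          # InfoNCE over matched pairs

gap = np.linalg.norm(img.mean(0) - txt.mean(0))   # centroid distance
print(f"contrastive loss {loss:.3f}, modality gap {gap:.3f}")
```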
- Reinforcement Learning, Collusion, and the Folk Theorem [0.0]
We explore the behaviour that emerges when learning agents repeatedly interact strategically, across a wide range of learning dynamics.
We consider the setting of a general repeated game with finite recall, for different forms of monitoring.
We obtain a Folk Theorem-like result and characterise the set of payoff vectors that can be obtained by these dynamics.
arXiv Detail & Related papers (2024-11-19T18:45:55Z)
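To make the folk-theorem statement concrete in the simplest possible case (our toy example, not the paper's general finite-recall setting), the snippet below computes a player's minmax value in the one-shot prisoner's dilemma; folk theorems characterise sustainable repeated-game payoffs as feasible vectors that dominate this value.

```python
# Our toy illustration: minmax value in the one-shot prisoner's dilemma.
R, S, T, P = 3, 0, 5, 1                      # standard PD payoffs (assumed)
payoff = {("C", "C"): (R, R), ("C", "D"): (S, T),
          ("D", "C"): (T, S), ("D", "D"): (P, P)}

# Row player's minmax: the column player picks b to minimise the row
# player's best response over a.
minmax_row = min(max(payoff[(a, b)][0] for a in "CD") for b in "CD")
print("minmax value:", minmax_row)           # P = 1
# Mutual cooperation (R, R) = (3, 3) strictly dominates (1, 1), so a folk
# theorem sustains it for sufficiently patient players.
```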
- Multi-agent cooperation through learning-aware policy gradients [53.63948041506278]
Self-interested individuals often fail to cooperate, posing a fundamental challenge for multi-agent learning.
We present the first unbiased, higher-derivative-free policy gradient algorithm for learning-aware reinforcement learning.
We derive from the iterated prisoner's dilemma a novel explanation for how and when cooperation arises among self-interested learning-aware agents.
arXiv Detail & Related papers (2024-10-24T10:48:42Z)
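A minimal sketch of what learning-awareness buys, under our own simplifications rather than the paper's algorithm: the opponent is a Q-learner conditioning on our last move, and a learning-aware agent scores each candidate strategy by simulating how the opponent's learning responds. The reactive tit-for-tat strategy, which shapes the opponent toward cooperation, comes out on top.

```python
import numpy as np

# Learning-awareness in miniature (our sketch, not the paper's algorithm).
R, S, T, P = 3, 0, 5, 1
PAY = {(1, 1): (R, R), (1, 0): (S, T), (0, 1): (T, S), (0, 0): (P, P)}

def simulate(strategy, rounds=5000, alpha=0.1, gamma=0.9, eps=0.1, seed=0):
    rng = np.random.default_rng(seed)
    Q = np.zeros((2, 2))                     # opponent state = my last move
    my_last, total = 1, 0.0
    for _ in range(rounds):
        b = (int(rng.integers(2)) if rng.random() < eps
             else int(Q[my_last].argmax()))
        a = strategy(my_last, b)             # reactive: we see their move
        r_me, r_opp = PAY[(a, b)]
        # opponent's Q update; its next state is my current move a
        Q[my_last, b] += alpha * (r_opp + gamma * Q[a].max() - Q[my_last, b])
        my_last, total = a, total + r_me
    return total / rounds

candidates = {
    "always defect":    lambda last, b: 0,
    "always cooperate": lambda last, b: 1,
    "tit-for-tat":      lambda last, b: b,   # copy the opponent's move
}
for name, strat in candidates.items():
    print(f"{name:16s} long-run payoff ~ {simulate(strat):.2f}")
```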
- Regularized Neural Ensemblers [55.15643209328513]
In this study, we explore employing regularized neural networks as ensemble methods.
Motivated by the risk of learning low-diversity ensembles, we propose regularizing the ensembling model by randomly dropping base model predictions.
We demonstrate this approach provides lower bounds for the diversity within the ensemble, reducing overfitting and improving generalization capabilities.
arXiv Detail & Related papers (2024-10-06T15:25:39Z)
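A sketch of the regularisation idea as we read it (names and numbers are ours): while training the ensembler, each base model's prediction is dropped with some probability, so the aggregator cannot collapse onto a single base model.

```python
import numpy as np

# Random prediction dropout during ensembling (our illustrative version).
rng = np.random.default_rng(0)
n_models, n_classes = 5, 3
preds = rng.dirichlet(np.ones(n_classes), size=n_models)  # base predictions

def ensemble(preds, weights, drop_p=0.3, train=True):
    w = weights.astype(float).copy()
    if train:                                # random prediction dropout
        mask = rng.random(len(w)) > drop_p
        if not mask.any():                   # keep at least one model
            mask[rng.integers(len(w))] = True
        w = w * mask
    w = w / w.sum()
    return w @ preds                         # weighted average of predictions

weights = np.ones(n_models)
print("train-time ensemble:", ensemble(preds, weights).round(3))
print("test-time ensemble: ", ensemble(preds, weights, train=False).round(3))
```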
- Mutation-Bias Learning in Games [1.743685428161914]
We present two variants of a multi-agent reinforcement learning algorithm based on evolutionary game theoretic considerations.
One variant enables us to prove results on its relationship to a system of ordinary differential equations of replicator-mutator dynamics type.
The more complicated variant enables comparisons to Q-learning based algorithms.
arXiv Detail & Related papers (2024-05-28T14:02:44Z)
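The replicator-mutator dynamics mentioned here can be integrated numerically in a few lines. This is our generic illustration with an arbitrary 2x2 payoff matrix, not the paper's specific system: xdot_i = sum_j x_j f_j Q_ji - phi x_i, with fitnesses f = Ax, mean fitness phi = x . f, and a row-stochastic mutation matrix Q.

```python
import numpy as np

# Replicator-mutator dynamics for a 2-action game (payoffs chosen by us).
A = np.array([[0.0, 3.0],
              [1.0, 2.0]])                  # payoff matrix (illustrative)
mu = 0.01
Q = np.array([[1 - mu, mu],
              [mu, 1 - mu]])                # mutation kernel

x, dt = np.array([0.9, 0.1]), 0.01
for _ in range(20000):
    f = A @ x                               # fitness of each strategy
    phi = x @ f                             # population mean fitness
    x = x + dt * ((x * f) @ Q - phi * x)    # replicator-mutator step
    x = np.clip(x, 0, None); x /= x.sum()   # numerical safety
print("stationary mixture ~", x.round(3))
```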
- Multi-Agent Bandit Learning through Heterogeneous Action Erasure Channels [21.860440468189044]
Multi-Armed Bandit (MAB) systems are witnessing an upswing in applications within multi-agent distributed environments.
In such settings, communication between agents executing actions and the primary learner making decisions can hinder the learning process.
We introduce novel algorithms that enable learners to interact concurrently with distributed agents across heterogeneous action erasure channels.
arXiv Detail & Related papers (2023-12-21T19:21:19Z)
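Our simplified rendering of the action-erasure setting (not the paper's algorithms): a central UCB learner sends an arm index to each agent, the message is erased with an agent-specific probability, and an erased agent replays the last arm it successfully received, which biases the feedback the learner observes.

```python
import numpy as np

# Toy action-erasure channel setting (all numbers are our assumptions).
rng = np.random.default_rng(0)
means = np.array([0.2, 0.5, 0.8])            # true Bernoulli arm means
erase_p = np.array([0.1, 0.4, 0.7])          # heterogeneous erasure channels
n_arms, n_agents, T = len(means), len(erase_p), 5000

counts, sums = np.ones(n_arms), np.zeros(n_arms)
last = np.zeros(n_agents, dtype=int)         # last arm each agent received
for t in range(1, T + 1):
    ucb = sums / counts + np.sqrt(2 * np.log(t) / counts)
    chosen = int(ucb.argmax())
    for i in range(n_agents):
        if rng.random() > erase_p[i]:        # message got through
            last[i] = chosen
        arm = last[i]                        # else replay the stale arm
        r = float(rng.random() < means[arm])
        counts[arm] += 1; sums[arm] += r
print("pulls per arm:", counts.astype(int), "| best arm:", int(means.argmax()))
```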
- Safe Multi-agent Learning via Trapping Regions [89.24858306636816]
We apply the concept of trapping regions, known from qualitative theory of dynamical systems, to create safety sets in the joint strategy space for decentralized learning.
We propose a binary partitioning algorithm for verifying that candidate sets form trapping regions in systems with known learning dynamics, and a sampling algorithm for scenarios where learning dynamics are not known.
arXiv Detail & Related papers (2023-02-27T14:47:52Z)
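In the spirit of the entry's sampling algorithm, here is a toy check (dynamics and box chosen by us) that a candidate box is a trapping region: sample points on its boundary and verify the learning dynamics point inward everywhere.

```python
import numpy as np

# Sampled trapping-region check on a toy 2D system (our assumptions).
def f(x):                                    # assumed learning dynamics
    return np.array([-x[0] + 0.5 * x[1], -x[1] - 0.5 * x[0]])

lo, hi, n = np.array([-1.0, -1.0]), np.array([1.0, 1.0]), 500
rng = np.random.default_rng(0)
ok = True
for _ in range(n):
    x = rng.uniform(lo, hi)
    dim, side = int(rng.integers(2)), int(rng.integers(2))   # pick a face
    x[dim] = hi[dim] if side else lo[dim]
    normal = np.zeros(2); normal[dim] = 1.0 if side else -1.0
    if f(x) @ normal >= 0:                   # flow not inward: reject
        ok = False; break
print("candidate box is a trapping region (sampled):", ok)
```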
- A unified stochastic approximation framework for learning in games [82.74514886461257]
We develop a flexible stochastic approximation framework for analyzing the long-run behavior of learning in games (both continuous and finite).
The proposed analysis template incorporates a wide array of popular learning algorithms, including gradient-based methods, exponential/multiplicative weights for learning in finite games, optimistic and bandit variants of the above, etc.
arXiv Detail & Related papers (2022-06-08T14:30:38Z)
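One concrete instance of the kind of dynamics such a framework covers, in our own notation rather than the paper's: exponential weights in rock-paper-scissors, written as a stochastic-approximation recursion y_{n+1} = y_n + gamma_n v(x_n) with vanishing step sizes; the time-averaged play approaches the uniform equilibrium.

```python
import numpy as np

# Exponential weights in a zero-sum game as a stochastic-approximation
# recursion (our illustrative instance).
A = np.array([[0, -1, 1],
              [1, 0, -1],
              [-1, 1, 0]])                 # rock-paper-scissors, player 1

def logit(y):                              # choice map: scores -> strategy
    z = np.exp(y - y.max())
    return z / z.sum()

y1, y2, avg = np.zeros(3), np.zeros(3), np.zeros(3)
for n in range(1, 20001):
    x1, x2 = logit(y1), logit(y2)
    gamma = 1.0 / np.sqrt(n)               # vanishing step size
    y1 += gamma * (A @ x2)                 # player 1's payoff vector
    y2 += gamma * (-A.T @ x1)              # player 2's payoff vector
    avg += (x1 - avg) / n                  # running average of play
print("time-averaged strategy ~", avg.round(3))   # ~ uniform equilibrium
```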
- Consistency and Diversity induced Human Motion Segmentation [231.36289425663702]
We propose a novel Consistency and Diversity induced human Motion Segmentation (CDMS) algorithm.
Our model factorizes the source and target data into distinct multi-layer feature spaces.
A multi-mutual learning strategy is carried out to reduce the domain gap between the source and target data.
arXiv Detail & Related papers (2022-02-10T06:23:56Z)
- Behavior Priors for Efficient Reinforcement Learning [97.81587970962232]
We consider how information and architectural constraints can be combined with ideas from the probabilistic modeling literature to learn behavior priors.
We discuss how such latent variable formulations connect to related work on hierarchical reinforcement learning (HRL) and mutual information and curiosity based objectives.
We demonstrate the effectiveness of our framework by applying it to a range of simulated continuous control domains.
arXiv Detail & Related papers (2020-10-27T13:17:18Z)
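A one-step sketch of the KL-regularised objective that behaviour-prior formulations typically use (our bandit-sized version, not the paper's full derivation): maximise E_pi[r] - beta * KL(pi || pi0), whose optimum reweights the prior by a Boltzmann factor.

```python
import numpy as np

# One-step KL-regularised objective against a behaviour prior (our toy).
r = np.array([1.0, 0.5, 0.0])               # rewards per action (assumed)
pi0 = np.array([0.2, 0.5, 0.3])             # behaviour prior
beta = 0.5                                  # strength of the prior

pi = pi0 * np.exp(r / beta)                 # pi*(a) proportional to pi0(a) exp(r(a)/beta)
pi /= pi.sum()
kl = np.sum(pi * np.log(pi / pi0))
print("optimal policy:", pi.round(3))
print("objective:", round(float(pi @ r - beta * kl), 3))
```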
- Towards Closing the Sim-to-Real Gap in Collaborative Multi-Robot Deep Reinforcement Learning [0.06554326244334865]
We analyze how multi-agent reinforcement learning can bridge the gap to reality in distributed multi-robot systems.
We introduce the effect of sensing, calibration, and accuracy mismatches in distributed reinforcement learning.
We discuss how both the different types of perturbations and the number of agents experiencing those perturbations affect the collaborative learning effort.
arXiv Detail & Related papers (2020-08-18T11:57:33Z)
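To make the mismatch types concrete, here is a small sketch (all names and magnitudes are our assumptions): each robot observes the shared simulator state through its own calibration gain, accuracy offset, and per-step sensing noise, the kind of perturbations one can randomise during training to close the sim-to-real gap.

```python
import numpy as np

# Per-agent sensing / calibration / accuracy perturbations (our assumptions).
rng = np.random.default_rng(0)
n_agents, obs_dim = 4, 6
state = rng.normal(size=obs_dim)                 # true simulator state

sensing_noise = 0.05 * rng.normal(size=(n_agents, obs_dim))      # per step
calibration = rng.normal(1.0, 0.02, size=(n_agents, 1))          # fixed gain
accuracy_bias = rng.normal(0.0, 0.03, size=(n_agents, obs_dim))  # offset

observations = calibration * state + accuracy_bias + sensing_noise
for i, obs in enumerate(observations):
    print(f"agent {i}: deviation from true state "
          f"{np.linalg.norm(obs - state):.3f}")
```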