Paying to Do Better: Games with Payments between Learning Agents
- URL: http://arxiv.org/abs/2405.20880v1
- Date: Fri, 31 May 2024 14:55:11 GMT
- Title: Paying to Do Better: Games with Payments between Learning Agents
- Authors: Yoav Kolumbus, Joe Halpern, Éva Tardos,
- Abstract summary: In repeated games, such as auctions, players typically use learning algorithms to choose their actions.
In this paper, we explore the impact of players incorporating monetary transfers into their agents' algorithms.
We propose a simple game-theoretic model to capture such scenarios.
- Score: 4.067193517689939
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In repeated games, such as auctions, players typically use learning algorithms to choose their actions. The use of such autonomous learning agents has become widespread on online platforms. In this paper, we explore the impact of players incorporating monetary transfers into their agents' algorithms, aiming to incentivize behavior in their favor. Our focus is on understanding when players have incentives to make use of monetary transfers, how these payments affect learning dynamics, and what the implications are for welfare and its distribution among the players. We propose a simple game-theoretic model to capture such scenarios. Our results on general games show that in a broad class of games, players benefit from letting their learning agents make payments to other learners during the game dynamics, and that in many cases, this kind of behavior improves welfare for all players. Our results on first- and second-price auctions show that in equilibria of the ``payment policy game,'' the agents' dynamics can reach strong collusive outcomes with low revenue for the auctioneer. These results highlight a challenge for mechanism design in systems where automated learning agents can benefit from interacting with their peers outside the boundaries of the mechanism.
Related papers
- Reciprocal Reward Influence Encourages Cooperation From Self-Interested Agents [2.1301560294088318]
Cooperation between self-interested individuals is a widespread phenomenon in the natural world, but remains elusive in interactions between artificially intelligent agents.
We introduce Reciprocators, reinforcement learning agents which are intrinsically motivated to reciprocate the influence of opponents' actions on their returns.
We show that Reciprocators can be used to promote cooperation in temporally extended social dilemmas during simultaneous learning.
arXiv Detail & Related papers (2024-06-03T06:07:27Z) - Incentivized Learning in Principal-Agent Bandit Games [62.41639598376539]
This work considers a repeated principal-agent bandit game, where the principal can only interact with her environment through the agent.
The principal can influence the agent's decisions by offering incentives which add up to his rewards.
We present nearly optimal learning algorithms for the principal's regret in both multi-armed and linear contextual settings.
arXiv Detail & Related papers (2024-03-06T16:00:46Z) - Impact of Decentralized Learning on Player Utilities in Stackelberg Games [57.08270857260131]
In many two-agent systems, each agent learns separately and the rewards of the two agents are not perfectly aligned.
We model these systems as Stackelberg games with decentralized learning and show that standard regret benchmarks result in worst-case linear regret for at least one player.
We develop algorithms to achieve near-optimal $O(T2/3)$ regret for both players with respect to these benchmarks.
arXiv Detail & Related papers (2024-02-29T23:38:28Z) - Improving Language Model Negotiation with Self-Play and In-Context
Learning from AI Feedback [97.54519989641388]
We study whether multiple large language models (LLMs) can autonomously improve each other in a negotiation game by playing, reflecting, and criticizing.
Only a subset of the language models we consider can self-play and improve the deal price from AI feedback.
arXiv Detail & Related papers (2023-05-17T11:55:32Z) - How Bad is Selfish Driving? Bounding the Inefficiency of Equilibria in
Urban Driving Games [64.71476526716668]
We study the (in)efficiency of any equilibrium players might agree to play.
We obtain guarantees that refine existing bounds on the Price of Anarchy.
Although the obtained guarantees concern open-loop trajectories, we observe efficient equilibria even when agents employ closed-loop policies.
arXiv Detail & Related papers (2022-10-24T09:32:40Z) - Player Modeling using Behavioral Signals in Competitive Online Games [4.168733556014873]
This paper focuses on the importance of addressing different aspects of playing behavior when modeling players for creating match-ups.
We engineer several behavioral features from a dataset of over 75,000 battle royale matches and create player models.
We then use the created models to predict ranks for different groups of players in the data.
arXiv Detail & Related papers (2021-11-29T22:53:17Z) - PsiPhi-Learning: Reinforcement Learning with Demonstrations using
Successor Features and Inverse Temporal Difference Learning [102.36450942613091]
We propose an inverse reinforcement learning algorithm, called emphinverse temporal difference learning (ITD)
We show how to seamlessly integrate ITD with learning from online environment interactions, arriving at a novel algorithm for reinforcement learning with demonstrations, called $Psi Phi$-learning.
arXiv Detail & Related papers (2021-02-24T21:12:09Z) - Learning from Learners: Adapting Reinforcement Learning Agents to be
Competitive in a Card Game [71.24825724518847]
We present a study on how popular reinforcement learning algorithms can be adapted to learn and to play a real-world implementation of a competitive multiplayer card game.
We propose specific training and validation routines for the learning agents, in order to evaluate how the agents learn to be competitive and explain how they adapt to each others' playing style.
arXiv Detail & Related papers (2020-04-08T14:11:05Z) - Deep Reinforcement Learning for FlipIt Security Game [2.0624765454705654]
We describe a deep learning model in which agents adapt to different classes of opponents and learn the optimal counter-strategy.
We apply our model to FlipIt, a two-player security game in which both players, the attacker and the defender, compete for ownership of a shared resource.
Our model is a deep neural network combined with Q-learning and is trained to maximize the defender's time of ownership of the resource.
arXiv Detail & Related papers (2020-02-28T18:26:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.