Markets with Heterogeneous Agents: Dynamics and Survival of Bayesian vs. No-Regret Learners
- URL: http://arxiv.org/abs/2502.08597v2
- Date: Wed, 25 Jun 2025 18:09:48 GMT
- Title: Markets with Heterogeneous Agents: Dynamics and Survival of Bayesian vs. No-Regret Learners
- Authors: David Easley, Yoav Kolumbus, Eva Tardos
- Abstract summary: We analyze the performance of heterogeneous learning agents in asset markets with stochastic payoffs. Surprisingly, we find that low regret is not sufficient for survival. No-regret learning requires less knowledge of the environment and is therefore more robust.
- Score: 3.985264439635754
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We analyze the performance of heterogeneous learning agents in asset markets with stochastic payoffs. Our main focus is on comparing Bayesian learners and no-regret learners who compete in markets and identifying the conditions under which each approach is more effective. Surprisingly, we find that low regret is not sufficient for survival: an agent can have regret as low as $O(\log T)$ but still vanish when competing against a Bayesian with a finite prior and any positive prior probability on the correct model. On the other hand, we show that Bayesian learning is fragile, while no-regret learning requires less knowledge of the environment and is therefore more robust. Motivated by the strengths and weaknesses of both approaches, we propose a balanced strategy for utilizing Bayesian updates that improves robustness and adaptability to distribution shifts, providing a step toward a best-of-both-worlds learning approach. The method is general, efficient, and easy to implement. Finally, we formally establish the relationship between the notions of survival and market dominance studied in economics and the framework of regret minimization, thus bridging these theories. More broadly, our work contributes to the understanding of dynamics with heterogeneous types of learning agents and their impact on markets.
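To make the comparison concrete, here is a minimal sketch in Python of the two learner types the abstract contrasts. It is not the paper's model: the even-odds binary market, Kelly-style betting, the two-model class, and all parameters are illustrative assumptions. A Bayesian agent bets its posterior mean over two candidate models, while a no-regret agent runs multiplicative weights (Hedge) over the same models as experts.

```python
import math
import random

random.seed(0)
P_TRUE = 0.7              # true probability of the "up" outcome (assumed)
MODELS = [0.7, 0.3]       # candidate models; the prior covers the truth
posterior = [0.5, 0.5]    # Bayesian posterior, initialized at the prior
weights = [0.5, 0.5]      # Hedge weights over the same two experts
ETA = 0.1                 # Hedge learning rate (assumed)

bayes_logw = hedge_logw = 0.0
for t in range(5000):
    up = random.random() < P_TRUE

    # Each agent bets a wealth fraction b on "up" at even odds:
    # log-wealth grows by log(2b) on "up" and log(2(1-b)) on "down".
    b_bayes = sum(q * m for q, m in zip(posterior, MODELS))  # posterior mean
    b_hedge = sum(w * m for w, m in zip(weights, MODELS)) / sum(weights)

    bayes_logw += math.log(2 * (b_bayes if up else 1 - b_bayes))
    hedge_logw += math.log(2 * (b_hedge if up else 1 - b_hedge))

    # Bayesian update: reweight each model by its likelihood, renormalize.
    posterior = [q * (m if up else 1 - m) for q, m in zip(posterior, MODELS)]
    z = sum(posterior)
    posterior = [q / z for q in posterior]

    # Hedge update: exponential weights on each expert's log-loss
    # (loss_i = -log(2 * p_i(outcome))), then renormalize.
    weights = [w * math.exp(ETA * math.log(2 * (m if up else 1 - m)))
               for w, m in zip(weights, MODELS)]
    z = sum(weights)
    weights = [w / z for w in weights]

print(f"log-wealth  Bayesian: {bayes_logw:.1f}   Hedge: {hedge_logw:.1f}")
```

Because the posterior reweights by full likelihoods while Hedge moves by a bounded step each round, the Bayesian adapts faster whenever its prior covers the truth, which mirrors the survival result above; the flip side, as the abstract notes, is that Bayesian learning is fragile when the prior is wrong.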
Related papers
- Conservative classifiers do consistently well with improving agents: characterizing statistical and online learning [7.857499581522375]
We characterize so-called learnability with improvements across multiple new axes. We show how to learn in more challenging settings, achieving lower generalization error under well-studied bounded noise models. We resolve open questions posed by Attias et al. for both proper and improper learning.
arXiv Detail & Related papers (2025-06-05T17:13:59Z)
- Deviations from the Nash equilibrium and emergence of tacit collusion in a two-player optimal execution game with reinforcement learning [0.9208007322096533]
We study a scenario in which two autonomous agents learn to liquidate the same asset optimally in the presence of market impact.
Our results show that the strategies learned by the agents deviate significantly from the Nash equilibrium of the corresponding market impact game.
We explore how different levels of market volatility influence the agents' performance and the equilibria they discover.
arXiv Detail & Related papers (2024-08-21T16:54:53Z)
- Multi-Agent Imitation Learning: Value is Easy, Regret is Hard [52.31989962031179]
We study a multi-agent imitation learning (MAIL) problem where we take the perspective of a learner attempting to coordinate a group of agents.
Most prior work in MAIL essentially reduces the problem to matching the behavior of the expert within the support of the demonstrations.
While doing so is sufficient to drive the value gap between the learner and the expert to zero under the assumption that agents are non-strategic, it does not guarantee robustness to deviations by strategic agents.
arXiv Detail & Related papers (2024-06-06T16:18:20Z)
- No-Regret Learning in Bilateral Trade via Global Budget Balance [29.514323697659613]
We provide the first no-regret algorithms for adversarial bilateral trade under various feedback models.
We show that in the full-feedback model, the learner can guarantee $\tilde O(\sqrt{T})$ regret against the best fixed prices in hindsight.
We also provide a learning algorithm guaranteeing a $\tilde O(T^{3/4})$ regret upper bound with one-bit feedback.
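As a rough illustration of the full-feedback setting, the sketch below runs Hedge over a discretized price grid. It is a simplified single-price variant: the paper's actual mechanism posts separate buyer and seller prices under a global budget-balance constraint, and the grid, valuation distribution, and learning rate here are all assumptions.

```python
import math
import random

random.seed(1)
GRID = [i / 20 for i in range(21)]   # candidate prices in [0, 1]
weights = [1.0] * len(GRID)
ETA = 0.5
T = 2000

total_gft = 0.0
for t in range(T):
    s, b = random.random(), random.random()   # seller and buyer valuations
    z = sum(weights)
    price = random.choices(GRID, [w / z for w in weights])[0]
    total_gft += (b - s) if s <= price <= b else 0.0

    # Full feedback: after seeing (s, b), every grid price can be scored,
    # so Hedge updates all weights, not only the played one.
    weights = [w * math.exp(ETA * ((b - s) if s <= q <= b else 0.0))
               for w, q in zip(weights, GRID)]
    z = sum(weights)
    weights = [w / z for w in weights]

print(f"average gain from trade per round: {total_gft / T:.3f}")
```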
arXiv Detail & Related papers (2023-10-18T22:34:32Z)
- Anytime Model Selection in Linear Bandits [61.97047189786905]
We develop ALEXP, which has an exponentially improved dependence on $M$ for its regret.
Our approach utilizes a novel time-uniform analysis of the Lasso, establishing a new connection between online learning and high-dimensional statistics.
arXiv Detail & Related papers (2023-07-24T15:44:30Z)
- A Black-box Approach for Non-stationary Multi-agent Reinforcement Learning [53.83345471268163]
We investigate learning the equilibria in non-stationary multi-agent systems.
We show how to test for various types of equilibria by a black-box reduction to single-agent learning.
arXiv Detail & Related papers (2023-06-12T23:48:24Z)
- MERMAIDE: Learning to Align Learners using Model-Based Meta-Learning [62.065503126104126]
We study how a principal can efficiently and effectively intervene on the rewards of a previously unseen learning agent in order to induce desirable outcomes.
This is relevant to many real-world settings like auctions or taxation, where the principal may not know the learning behavior nor the rewards of real people.
We introduce MERMAIDE, a model-based meta-learning framework to train a principal that can quickly adapt to out-of-distribution agents.
arXiv Detail & Related papers (2023-04-10T15:44:50Z)
- Fundamental Bounds on Online Strategic Classification [13.442155854812528]
We show that no deterministic algorithm can achieve a mistake bound $o(\Delta)$ in the strategic setting.
We also extend this to the agnostic setting and obtain an algorithm with a $\Delta$ multiplicative regret.
We design randomized algorithms that achieve sublinear regret bounds against both oblivious and adaptive adversaries.
arXiv Detail & Related papers (2023-02-23T22:39:43Z)
- Bandit Social Learning: Exploration under Myopic Behavior [54.767961587919075]
We study social learning dynamics motivated by reviews on online platforms.
Agents collectively follow a simple multi-armed bandit protocol, but each agent acts myopically, without regard to exploration.
We derive stark learning failures for any such behavior, and provide matching positive results.
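A toy sketch of the failure mode this entry describes (the Bernoulli means, optimistic pseudo-counts, and horizon are illustrative assumptions, not the paper's construction): agents who always pull the empirically best arm can collectively lock onto the worse arm with constant probability, since nobody explores.

```python
import random

random.seed(2)
MEANS = [0.6, 0.5]                       # arm 0 is truly better
RUNS, AGENTS = 1000, 500
failures = 0
for run in range(RUNS):
    counts, sums = [1, 1], [1.0, 1.0]    # optimistic pseudo-observations
    for agent in range(AGENTS):
        # Each agent myopically pulls the arm with the best empirical mean.
        arm = max((0, 1), key=lambda a: sums[a] / counts[a])
        sums[arm] += 1.0 if random.random() < MEANS[arm] else 0.0
        counts[arm] += 1
    if counts[1] > counts[0]:            # the crowd settled on the worse arm
        failures += 1
print(f"locked onto the worse arm in {failures}/{RUNS} runs")
```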
arXiv Detail & Related papers (2023-02-15T01:57:57Z)
- Efficient Model-based Multi-agent Reinforcement Learning via Optimistic Equilibrium Computation [93.52573037053449]
H-MARL (Hallucinated Multi-Agent Reinforcement Learning) learns successful equilibrium policies after a few interactions with the environment.
We demonstrate our approach experimentally on an autonomous driving simulation benchmark.
arXiv Detail & Related papers (2022-03-14T17:24:03Z)
- The Best of Both Worlds: Reinforcement Learning with Logarithmic Regret and Policy Switches [84.54669549718075]
We study the problem of regret minimization for episodic Reinforcement Learning (RL).
We focus on learning with general function classes and general model classes.
We show that a logarithmic regret bound is realizable by algorithms with $O(\log T)$ switching cost.
arXiv Detail & Related papers (2022-03-03T02:55:55Z)
- Finding General Equilibria in Many-Agent Economic Simulations Using Deep Reinforcement Learning [72.23843557783533]
We show that deep reinforcement learning can discover stable solutions that are $\epsilon$-Nash equilibria for a meta-game over agent types.
Our approach is more flexible and does not need unrealistic assumptions, e.g., market clearing.
We demonstrate our approach in real-business-cycle models, a representative family of DGE models, with 100 worker-consumers, 10 firms, and a government that taxes and redistributes.
arXiv Detail & Related papers (2022-01-03T17:00:17Z)
- Learning Equilibria in Matching Markets from Bandit Feedback [139.29934476625488]
We develop a framework and algorithms for learning stable market outcomes under uncertainty.
Our work takes a first step toward elucidating when and how stable matchings arise in large, data-driven marketplaces.
arXiv Detail & Related papers (2021-08-19T17:59:28Z)
- Robust Risk-Sensitive Reinforcement Learning Agents for Trading Markets [23.224860573461818]
Trading markets represent a real-world financial application to deploy reinforcement learning agents.
Our work is the first to extend empirical game-theoretic analysis for multi-agent learning to risk-sensitive payoffs.
arXiv Detail & Related papers (2021-07-16T19:15:13Z)
- Dynamic Pricing and Learning under the Bass Model [16.823029377470366]
We develop an algorithm that satisfies a high probability regret guarantee of order $\tilde O(m^{2/3})$, where the market size $m$ is known a priori.
Unlike most regret analysis results, in the present problem the market size $m$ is the fundamental driver of the complexity.
arXiv Detail & Related papers (2021-03-09T03:27:33Z)
- Learning Strategies in Decentralized Matching Markets under Uncertain Preferences [91.3755431537592]
We study the problem of decision-making under a scarcity of shared resources when the preferences of agents are unknown a priori.
Our approach is based on the representation of preferences in a reproducing kernel Hilbert space.
We derive optimal strategies that maximize agents' expected payoffs.
arXiv Detail & Related papers (2020-10-29T03:08:22Z)
- Hedging using reinforcement learning: Contextual $k$-Armed Bandit versus $Q$-learning [0.22940141855172028]
We study the construction of replication strategies for contingent claims in the presence of risk and market friction.
In this article, the hedging problem is viewed as an instance of a risk-averse contextual $k$-armed bandit problem.
We find that the $k$-armed bandit model naturally fits the Profit and Loss formulation of hedging.
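To see why the bandit framing is natural, here is a deliberately crude sketch (a linearized claim with delta 0.5, Gaussian one-step moves, and epsilon-greedy in place of the paper's risk-averse algorithm; all of these are assumptions for illustration): arms are candidate hedge ratios and the reward is a risk-penalized P&L, so the bandit learns the variance-minimizing hedge.

```python
import random

random.seed(3)
ARMS = [i / 10 for i in range(11)]   # candidate hedge ratios
counts = [0] * len(ARMS)
means = [0.0] * len(ARMS)
EPS = 0.1                            # exploration rate

for t in range(5000):
    if random.random() < EPS:
        a = random.randrange(len(ARMS))
    else:
        a = max(range(len(ARMS)), key=lambda i: means[i])
    dS = random.gauss(0.0, 1.0)          # one-step underlying move
    pnl = (ARMS[a] - 0.5) * dS           # net P&L vs. a delta-0.5 claim
    reward = -pnl * pnl                  # risk-averse proxy: punish variance
    counts[a] += 1
    means[a] += (reward - means[a]) / counts[a]

best = max(range(len(ARMS)), key=lambda i: means[i])
print(f"learned hedge ratio: {ARMS[best]:.1f}")   # should approach 0.5
```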
arXiv Detail & Related papers (2020-07-03T11:34:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.