Related papers: Decentralized and Uncoordinated Learning of Stable Matchings: A Game-Theoretic Approach

Decentralized and Uncoordinated Learning of Stable Matchings: A Game-Theoretic Approach

URL: http://arxiv.org/abs/2407.21294v2
Date: Thu, 15 Aug 2024 02:57:09 GMT
Title: Decentralized and Uncoordinated Learning of Stable Matchings: A Game-Theoretic Approach
Authors: S. Rasoul Etesami, R. Srikant,
Abstract summary: We consider the problem of learning stable matchings with unknown preferences in a decentralized and uncoordinated manner. We show that applying the exponential weight (EXP) learning algorithm to the stable matching game achieves logarithmic regret in a fully decentralized and uncoordinated fashion. We also introduce another decentralized and uncoordinated learning algorithm that globally converges to a stable matching with arbitrarily high probability.
Score: 9.376820789668304
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We consider the problem of learning stable matchings with unknown preferences in a decentralized and uncoordinated manner, where "decentralized" means that players make decisions individually without the influence of a central platform, and "uncoordinated" means that players do not need to synchronize their decisions using pre-specified rules. First, we provide a game formulation for this problem with known preferences, where the set of pure Nash equilibria (NE) coincides with the set of stable matchings, and mixed NE can be rounded to a stable matching. Then, we show that for hierarchical markets, applying the exponential weight (EXP) learning algorithm to the stable matching game achieves logarithmic regret in a fully decentralized and uncoordinated fashion. Moreover, we show that EXP converges locally and exponentially fast to a stable matching in general markets. We also introduce another decentralized and uncoordinated learning algorithm that globally converges to a stable matching with arbitrarily high probability. Finally, we provide stronger feedback conditions under which it is possible to drive the market faster toward an approximate stable matching. Our proposed game-theoretic framework bridges the discrete problem of learning stable matchings with the problem of learning NE in continuous-action games.

Related papers

Competing Bandits in Matching Markets via Super Stability [3.846293944458337]
We study bandit learning in matching markets with two-sided reward uncertainty.<n>We demonstrate the advantage of the Extended Gale-Shapley (GS) algorithm over the standard GS algorithm in achieving true stable matchings.
arXiv Detail & Related papers (2025-06-19T00:03:20Z)
Probably Correct Optimal Stable Matching for Two-Sided Markets Under Uncertainty [5.250288418639076]
We consider a learning problem for the stable marriage model under unknown preferences for the left side of the market. Our aim is to quickly identify the stable matching that is left-side optimal, rendering this a pure exploration problem with bandit feedback.
arXiv Detail & Related papers (2025-01-06T13:59:57Z)
Competing Bandits in Decentralized Contextual Matching Markets [13.313881962771777]
We study decentralized learning in two-sided matching markets where the demand side (aka players or agents) competes for the supply side (aka arms)<n>Motivated by the linear contextual bandit framework, we assume that for each agent, an arm-mean may be represented by a linear function of a known feature vector and an unknown (agent-specific) parameter.<n>We propose learning algorithms to identify the latent environment and obtain stable matchings simultaneously.
arXiv Detail & Related papers (2024-11-18T18:08:05Z)
The equivalence of dynamic and strategic stability under regularized learning in games [33.74394172275373]
We examine the long-run behavior of regularized, no-regret learning in finite games. We obtain an equivalence between strategic and dynamic stability. We show that methods based on entropic regularization converge at a geometric rate.
arXiv Detail & Related papers (2023-11-04T14:07:33Z)
Hardness of Independent Learning and Sparse Equilibrium Computation in Markov Games [70.19141208203227]
We consider the problem of decentralized multi-agent reinforcement learning in Markov games. We show that no algorithm attains no-regret in general-sum games when executed independently by all players. We show that our lower bounds hold even for seemingly easier setting in which all agents are controlled by a centralized algorithm.
arXiv Detail & Related papers (2023-03-22T03:28:12Z)
Finding mixed-strategy equilibria of continuous-action games without gradients using randomized policy networks [83.28949556413717]
We study the problem of computing an approximate Nash equilibrium of continuous-action game without access to gradients. We model players' strategies using artificial neural networks. This paper is the first to solve general continuous-action games with unrestricted mixed strategies and without any gradient information.
arXiv Detail & Related papers (2022-11-29T05:16:41Z)
Efficiently Computing Nash Equilibria in Adversarial Team Markov Games [19.717850955051837]
We introduce a class of games in which a team identically players is competing against an adversarial player. This setting allows for a unifying treatment of zero-sum Markov games potential games. Our main contribution is the first algorithm for computing stationary $epsilon$-approximate Nash equilibria in adversarial team Markov games.
arXiv Detail & Related papers (2022-08-03T16:41:01Z)
Learn to Match with No Regret: Reinforcement Learning in Markov Matching Markets [151.03738099494765]
We study a Markov matching market involving a planner and a set of strategic agents on the two sides of the market. We propose a reinforcement learning framework that integrates optimistic value iteration with maximum weight matching. We prove that the algorithm achieves sublinear regret.
arXiv Detail & Related papers (2022-03-07T19:51:25Z)
Optimal Algorithms for Decentralized Stochastic Variational Inequalities [113.43047601775453]
This work concentrates on the decentralized setting, which is increasingly important but not well understood. We present lower bounds for both communication and local iterations and construct optimal algorithms that match these lower bounds. Our algorithms are the best among the available not only in the decentralized case, but also in the deterministic and non-distributed literature.
arXiv Detail & Related papers (2022-02-06T13:14:02Z)
Learning Equilibria in Matching Markets from Bandit Feedback [139.29934476625488]
We develop a framework and algorithms for learning stable market outcomes under uncertainty. Our work takes a first step toward elucidating when and how stable matchings arise in large, data-driven marketplaces.
arXiv Detail & Related papers (2021-08-19T17:59:28Z)
Optimality and Stability in Federated Learning: A Game-theoretic Approach [0.0]
We motivate and define a notion of optimality given by the average error rates among federating agents (players) First, we show that for some regions of parameter space, all stable arrangements are optimal (Price of Anarchy equal to 1). However, we show this is not true for all settings: there exist examples of stable arrangements with higher cost than optimal (Price of Anarchy greater than 1).
arXiv Detail & Related papers (2021-06-17T15:03:51Z)
Survival of the strictest: Stable and unstable equilibria under regularized learning with partial information [32.384868685390906]
We examine the Nash equilibrium convergence properties of no-regret learning in general N-player games. We establish a comprehensive equivalence between the stability of a Nash equilibrium and its support. It provides a clear refinement criterion for the prediction of the day-to-day behavior of no-regret learning in games.
arXiv Detail & Related papers (2021-01-12T18:55:11Z)
Bandit Learning in Decentralized Matching Markets [82.39061186055775]
We study two-sided matching markets in which one side of the market (the players) does not have a priori knowledge about its preferences for the other side (the arms) and is required to learn its preferences from experience. This model extends the standard multi-armed bandit framework to a decentralized multiple player setting with competition. We show that the algorithm is incentive compatible whenever the arms' preferences are shared, but not necessarily so when preferences are fully general.
arXiv Detail & Related papers (2020-12-14T08:58:07Z)

This list is automatically generated from the titles and abstracts of the papers in this site.