Regret, stability, and fairness in matching markets with bandit learners
- URL: http://arxiv.org/abs/2102.06246v1
- Date: Thu, 11 Feb 2021 20:18:12 GMT
- Title: Regret, stability, and fairness in matching markets with bandit learners
- Authors: Sarah H. Cen and Devavrat Shah
- Abstract summary: We consider the two-sided matching market with bandit learners.
We show that it is possible to simultaneously guarantee four desiderata: (1) incentive compatibility, i.e., stability, (2) low regret, i.e., $O(\log(T))$ optimal regret, (3) fairness in the distribution of regret among agents, and (4) high social welfare.
- Score: 22.76773950247833
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider the two-sided matching market with bandit learners. In the
standard matching problem, users and providers are matched to ensure incentive
compatibility via the notion of stability. However, contrary to the core
assumption of the matching problem, users and providers do not know their true
preferences a priori and must learn them. To relax this assumption, recent
works propose to blend the matching and multi-armed bandit problems. They
establish that it is possible to assign matchings that are stable (i.e.,
incentive-compatible) at every time step while also allowing agents to learn
enough so that the system converges to matchings that are stable under the
agents' true preferences. However, while some agents may incur low regret under
these matchings, others can incur high regret -- specifically, $\Omega(T)$
optimal regret where $T$ is the time horizon. In this work, we incorporate
costs and transfers in the two-sided matching market with bandit learners in
order to faithfully model competition between agents. We prove that, under our
framework, it is possible to simultaneously guarantee four desiderata: (1)
incentive compatibility, i.e., stability, (2) low regret, i.e., $O(\log(T))$
optimal regret, (3) fairness in the distribution of regret among agents, and
(4) high social welfare.
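The abstract names the guarantees but not the mechanism. As a rough illustration of the general recipe it belongs to (optimistic preference estimates fed into a transfer-compatible matching), here is a minimal sketch, assuming a centralized platform, Gaussian reward noise, and transferable utility so that stable outcomes maximize total surplus (as in the Shapley-Shubik assignment game); the name `ucb_matching` and all parameters are hypothetical, and the sketch tracks only welfare regret, not the per-agent fairness guarantee.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def ucb_matching(true_surplus, T, noise=0.1, seed=0):
    """Hypothetical sketch: each round, match users to providers by
    maximizing optimistic (UCB) estimates of pairwise match surplus.
    With transfers, maximum-weight matchings coincide with stable
    outcomes (Shapley-Shubik assignment game), so optimism here
    targets stability while preferences are still being learned."""
    rng = np.random.default_rng(seed)
    means = np.zeros(true_surplus.shape)
    counts = np.zeros(true_surplus.shape)
    # Benchmark: best total surplus under the true (unknown) means.
    r, c = linear_sum_assignment(-true_surplus)
    opt = true_surplus[r, c].sum()
    welfare_regret = 0.0
    for t in range(1, T + 1):
        bonus = np.sqrt(2.0 * np.log(t + 1) / np.maximum(counts, 1.0))
        r, c = linear_sum_assignment(-(means + bonus))  # max-weight matching
        for i, j in zip(r, c):
            reward = true_surplus[i, j] + rng.normal(0.0, noise)
            counts[i, j] += 1
            means[i, j] += (reward - means[i, j]) / counts[i, j]
        welfare_regret += opt - true_surplus[r, c].sum()
    return welfare_regret
```

On a toy market, e.g. `ucb_matching(np.array([[1.0, 0.2], [0.3, 0.9]]), T=2000)`, cumulative welfare regret grows roughly logarithmically in $T$, in the spirit of the $O(\log(T))$ guarantee above.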
Related papers
- Putting Gale & Shapley to Work: Guaranteeing Stability Through Learning [14.448192914855674]
Two-sided matching markets describe a class of problems wherein participants from one side of the market must be matched to those from the other side according to their preferences.
We exploit the structure of stable solutions to devise learning algorithms that are more likely to reach stable outcomes; a textbook deferred-acceptance baseline is sketched below.
arXiv Detail & Related papers (2024-10-06T06:47:53Z)
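For reference, the full-information baseline that the entry's learning algorithms build on; this is the classical Gale-Shapley deferred-acceptance routine, not the paper's method.

```python
def deferred_acceptance(proposer_prefs, receiver_prefs):
    """Textbook Gale-Shapley deferred acceptance (proposer-optimal).
    proposer_prefs[p] and receiver_prefs[r] are complete ranked lists
    of the other side's indices, most preferred first."""
    n = len(proposer_prefs)
    # rank[r][p] = how receiver r ranks proposer p (lower is better)
    rank = [{p: k for k, p in enumerate(prefs)} for prefs in receiver_prefs]
    next_pick = [0] * n        # next index in each proposer's list
    holds = [None] * n         # receiver -> proposer currently held
    free = list(range(n))
    while free:
        p = free.pop()
        r = proposer_prefs[p][next_pick[p]]
        next_pick[p] += 1
        if holds[r] is None:
            holds[r] = p
        elif rank[r][p] < rank[r][holds[r]]:
            free.append(holds[r])   # displaced proposer re-enters
            holds[r] = p
        else:
            free.append(p)          # rejected; tries next choice later
    return {holds[r]: r for r in range(n)}
```

For example, `deferred_acceptance([[0, 1], [0, 1]], [[1, 0], [0, 1]])` returns `{1: 0, 0: 1}`: both proposers want receiver 0, who prefers proposer 1, so proposer 0 settles for receiver 1.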
- Improved Bandits in Many-to-one Matching Markets with Incentive Compatibility [12.05174711411919]
Two-sided matching markets have been widely studied in the literature due to their rich applications.
We first extend an existing algorithm for the one-to-one setting to this more general setting and show it achieves a near-optimal bound for player-optimal regret.
We propose the adaptively explore-then-deferred-acceptance (AETDA) algorithm for the responsiveness setting and derive an upper bound on player-optimal stable regret; a plain explore-then-commit sketch follows below.
arXiv Detail & Related papers (2024-01-03T03:45:35Z)
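AETDA's adaptive phase lengths are the entry's contribution and are not reproduced here. For contrast, a plain explore-then-commit sketch for the one-to-one case; the names, noise model, and commit rule are all assumptions.

```python
import numpy as np

def explore_then_commit(true_means, explore_rounds, seed=0):
    """Hypothetical sketch: players sample every arm `explore_rounds`
    times on a round-robin schedule (offset by player index so that,
    with at most as many players as arms, no two players collide),
    then commit using empirical means. The commit step is a serial
    dictatorship, a stand-in for deferred acceptance that suffices
    when all arms rank players identically."""
    rng = np.random.default_rng(seed)
    n_players, n_arms = true_means.shape
    est = np.zeros(true_means.shape)
    for t in range(explore_rounds * n_arms):
        for p in range(n_players):
            a = (p + t) % n_arms    # distinct offsets -> no collisions
            est[p, a] += (true_means[p, a] + rng.normal()) / explore_rounds
    taken, match = set(), {}
    for p in range(n_players):      # commit greedily by empirical mean
        for a in np.argsort(-est[p]):
            if a not in taken:
                taken.add(int(a))
                match[p] = int(a)
                break
    return match
```

The fixed `explore_rounds` budget is exactly the tuning burden that adaptive schemes like AETDA aim to remove.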
- Player-optimal Stable Regret for Bandit Learning in Matching Markets [12.54215178882448]
We show that the optimal stable regret of each player can be upper bounded by $O(K\log T/\Delta^2)$, where $K$ is the number of arms, $T$ is the horizon, and $\Delta$ is the players' minimum preference gap among the first $N+1$ ranked arms.
Our work significantly improves previous works which either have a weaker player-pessimal stable matching objective or apply only to markets with special assumptions.
arXiv Detail & Related papers (2023-07-20T14:10:33Z)
- A Black-box Approach for Non-stationary Multi-agent Reinforcement Learning [53.83345471268163]
We investigate learning the equilibria in non-stationary multi-agent systems.
We show how to test for various types of equilibria by a black-box reduction to single-agent learning.
arXiv Detail & Related papers (2023-06-12T23:48:24Z)
- Decentralized Competing Bandits in Non-Stationary Matching Markets [46.13741000158631]
We introduce a framework for decentralized two-sided matching markets in non-stationary (dynamic) environments.
We propose and analyze a decentralized and asynchronous learning algorithm, namely Decentralized Non-stationary Competing Bandits (DNCB).
We characterize the resulting forced exploration and obtain sub-linear (logarithmic) regret for DNCB; a stationary competing-bandits sketch follows the entry.
arXiv Detail & Related papers (2022-05-31T21:05:30Z)
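The collision mechanics behind this line of work can be shown in a stationary toy model. A minimal sketch, assuming arms break ties via fixed preference lists; the non-stationary setting DNCB targets would additionally need sliding-window or restarted estimates, which are omitted here.

```python
import numpy as np

def competing_ucb(true_means, arm_pref, T, seed=0):
    """Sketch of competing bandits (stationary case, not DNCB itself):
    players run UCB independently; when several players pull the same
    arm, only the arm's most-preferred bidder is matched and observes
    a reward, while the others collide and learn nothing that round.
    arm_pref[a] is arm a's ranked list of player indices."""
    rng = np.random.default_rng(seed)
    n_players, n_arms = true_means.shape
    means = np.zeros((n_players, n_arms))
    counts = np.zeros((n_players, n_arms))
    for t in range(1, T + 1):
        ucb = means + np.sqrt(2.0 * np.log(t + 1) / np.maximum(counts, 1.0))
        pulls = ucb.argmax(axis=1)                # each player's bid
        for a in set(pulls.tolist()):
            bidders = [p for p in range(n_players) if pulls[p] == a]
            winner = min(bidders, key=arm_pref[a].index)  # arm's favorite
            reward = true_means[winner, a] + rng.normal()
            counts[winner, a] += 1
            means[winner, a] += (reward - means[winner, a]) / counts[winner, a]
    return means
```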
- Learn to Match with No Regret: Reinforcement Learning in Markov Matching Markets [151.03738099494765]
We study a Markov matching market involving a planner and a set of strategic agents on the two sides of the market.
We propose a reinforcement learning framework that integrates optimistic value iteration with maximum weight matching.
We prove that the algorithm achieves sublinear regret.
arXiv Detail & Related papers (2022-03-07T19:51:25Z)
- Learning Equilibria in Matching Markets from Bandit Feedback [139.29934476625488]
We develop a framework and algorithms for learning stable market outcomes under uncertainty.
Our work takes a first step toward elucidating when and how stable matchings arise in large, data-driven marketplaces.
arXiv Detail & Related papers (2021-08-19T17:59:28Z)
- Bandit Learning in Decentralized Matching Markets [82.39061186055775]
We study two-sided matching markets in which one side of the market (the players) does not have a priori knowledge about its preferences for the other side (the arms) and is required to learn its preferences from experience.
This model extends the standard multi-armed bandit framework to a decentralized multiple player setting with competition.
We show that the algorithm is incentive compatible whenever the arms' preferences are shared, but not necessarily so when preferences are fully general.
arXiv Detail & Related papers (2020-12-14T08:58:07Z)
- Learning Strategies in Decentralized Matching Markets under Uncertain Preferences [91.3755431537592]
We study the problem of decision-making in the setting of a scarcity of shared resources when the preferences of agents are unknown a priori.
Our approach is based on the representation of preferences in a reproducing kernel Hilbert space; a toy kernel-ridge sketch follows the entry.
We derive optimal strategies that maximize agents' expected payoffs.
arXiv Detail & Related papers (2020-10-29T03:08:22Z)
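To make the RKHS idea in the last entry concrete, a toy sketch, assuming a Gaussian kernel; the function and its parameters are illustrative, not the paper's estimator or strategy.

```python
import numpy as np

def kernel_ridge_preference(X, y, lam=0.1, gamma=1.0):
    """Toy RKHS preference model: fit expected match payoff as a
    function of match features via kernel ridge regression with a
    Gaussian kernel, and return a predictor for new feature vectors.
    Illustrative only; not the paper's estimator."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    K = k(X, X)
    alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)  # (K + lam*I) a = y
    return lambda Z: k(Z, X) @ alpha
```

Fitting on observed (feature, payoff) pairs and scoring candidate matches with the returned predictor illustrates the kind of preference representation the entry alludes to.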