Value Variance Minimization for Learning Approximate Equilibrium in
Aggregation Systems
- URL: http://arxiv.org/abs/2003.07088v1
- Date: Mon, 16 Mar 2020 10:02:42 GMT
- Title: Value Variance Minimization for Learning Approximate Equilibrium in
Aggregation Systems
- Authors: Tanvi Verma, Pradeep Varakantham
- Abstract summary: We consider the problem of learning approximate equilibrium solutions (win-win) in aggregation systems, so that individuals have an incentive to remain in the aggregation system.
- Score: 8.140037969280716
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: For effective matching of resources (e.g., taxis, food, bikes, shopping
items) to customer demand, aggregation systems have been extremely successful.
In aggregation systems, a central entity (e.g., Uber, Food Panda, Ofo)
aggregates supply (e.g., drivers, delivery personnel) and matches demand to
supply on a continuous basis (sequential decisions). Because the central
entity aims to maximize its own profits, individual suppliers can get
sacrificed, creating an incentive for individuals to leave the system. In this
paper, we consider the problem of learning approximate equilibrium solutions
(win-win solutions) in aggregation systems, so that individuals have an
incentive to remain in the aggregation system.
Unfortunately, such systems have thousands of agents and must contend with
demand uncertainty, so the underlying problem is a (Partially Observable)
Stochastic Game. Given the significant complexity of learning or planning in a
stochastic game, we make three key contributions: (a) to exploit the
infinitesimally small contribution of each agent and the anonymity of
interactions (rewards and transitions depend only on agent counts), we
represent the problem as a Multi-Agent Reinforcement Learning (MARL) problem
that builds on insights from the non-atomic congestion games model; (b) we
provide a novel variance reduction mechanism, exploiting the infinitesimally
small contribution of each agent, for moving the joint solution towards a Nash
Equilibrium; and (c) we provide detailed results on three different domains to
demonstrate the utility of our approach in comparison to state-of-the-art
methods.
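As a rough illustration of the variance reduction idea above, a minimal sketch (not the authors' implementation) is given below: the class name `VarianceMinQLearner`, the convex-combination penalty controlled by `var_weight`, and the tabular zone/action setup are all assumptions made here. The sketch penalizes the spread of value estimates across anonymous agents that occupy the same state.

```python
import numpy as np

class VarianceMinQLearner:
    """Tabular Q-learner with a value-variance penalty across agents
    sharing the same state (zone). Illustrative sketch only."""

    def __init__(self, n_zones, n_actions, lr=0.1, gamma=0.95, var_weight=0.5):
        self.q = np.zeros((n_zones, n_actions))  # Q-table shared by all anonymous agents
        self.lr, self.gamma, self.var_weight = lr, gamma, var_weight

    def update(self, transitions):
        """transitions: list of (zone, action, reward, next_zone) tuples collected
        from all agents currently in the same zone; identity is ignored (anonymity)."""
        # Standard TD target for each agent's sample.
        targets = np.array([r + self.gamma * self.q[nz].max()
                            for (_, _, r, nz) in transitions])
        # Variance reduction: pull each target toward the zone-level mean, nudging
        # agents in the same state toward equal values (an equilibrium-like property).
        adjusted = (1 - self.var_weight) * targets + self.var_weight * targets.mean()
        for (z, a, _, _), tgt in zip(transitions, adjusted):
            self.q[z, a] += self.lr * (tgt - self.q[z, a])
        return float(targets.var())  # diagnostic: how unequal the agents' values were

# Example usage: three agents in zone 0 report their transitions for one step.
learner = VarianceMinQLearner(n_zones=5, n_actions=3)
batch = [(0, 1, 2.0, 3), (0, 1, 0.5, 3), (0, 2, 1.5, 4)]
print(learner.update(batch))
```

The convex combination toward the zone mean stands in for the paper's variance-minimizing correction; in the actual setting this would operate on learned value functions over agent counts rather than a small Q-table.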
Related papers
- Principal-Agent Reinforcement Learning: Orchestrating AI Agents with Contracts [20.8288955218712]
We propose a framework where a principal guides an agent in a Markov Decision Process (MDP) using a series of contracts.
We present and analyze a meta-algorithm that iteratively optimizes the policies of the principal and agent.
We then scale our algorithm with deep Q-learning and analyze its convergence in the presence of approximation error.
arXiv Detail & Related papers (2024-07-25T14:28:58Z) - On the Complexity of Multi-Agent Decision Making: From Learning in Games
to Partial Monitoring [105.13668993076801]
A central problem in the theory of multi-agent reinforcement learning (MARL) is to understand what structural conditions and algorithmic principles lead to sample-efficient learning guarantees.
We study this question in a general framework for interactive decision making with multiple agents.
We show that characterizing the statistical complexity for multi-agent decision making is equivalent to characterizing the statistical complexity of single-agent decision making.
arXiv Detail & Related papers (2023-05-01T06:46:22Z) - Learning Individual Policies in Large Multi-agent Systems through Local
Variance Minimization [8.140037969280716]
In multi-agent systems with a large number of agents, the contribution of each agent to the value of other agents is minimal.
We provide a novel Multi-Agent Reinforcement Learning (MARL) mechanism that minimizes variance across values of agents in the same state.
We show that our approach reduces the variance in revenues earned by taxi drivers, while still providing higher joint revenues than leading approaches.
arXiv Detail & Related papers (2022-12-27T06:59:00Z) - Multi-Agent Neural Rewriter for Vehicle Routing with Limited Disclosure
of Costs [65.23158435596518]
We solve the multi-vehicle routing problem as a team Markov game with partially observable costs.
Our multi-agent reinforcement learning approach, the so-called multi-agent Neural Rewriter, builds on the single-agent Neural Rewriter to solve the problem by iteratively rewriting solutions.
arXiv Detail & Related papers (2022-06-13T09:17:40Z) - Modelling Cournot Games as Multi-agent Multi-armed Bandits [4.751331778201811]
We investigate the use of a multi-agent multi-armed bandit (MA-MAB) setting for modeling repeated Cournot oligopoly games.
We find that an $\epsilon$-greedy approach offers a more viable learning mechanism than other traditional MAB approaches.
We propose two novel approaches that take advantage of the ordered action space: $\epsilon$-greedy+HL and $\epsilon$-greedy+EL.
arXiv Detail & Related papers (2022-01-01T22:02:47Z) - Inducing Equilibria via Incentives: Simultaneous Design-and-Play Finds
Global Optima [114.31577038081026]
We propose an efficient method that tackles the designer's and agents' problems simultaneously in a single loop.
Although the designer does not solve the equilibrium problem repeatedly, it can anticipate the overall influence of the incentives on the agents.
We prove that the algorithm converges to the global optima at a sublinear rate for a broad class of games.
arXiv Detail & Related papers (2021-10-04T06:53:59Z) - Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for MFC.
We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z) - End-to-End Learning and Intervention in Games [60.41921763076017]
We provide a unified framework for learning and intervention in games.
We propose two approaches, respectively based on explicit and implicit differentiation.
The analytical results are validated using several real-world problems.
arXiv Detail & Related papers (2020-10-26T18:39:32Z) - Calibration of Shared Equilibria in General Sum Partially Observable
Markov Games [15.572157454411533]
We consider a general sum partially observable Markov game where agents of different types share a single policy network.
This paper aims at i) formally understanding equilibria reached by such agents, and ii) matching emergent phenomena of such equilibria to real-world targets.
arXiv Detail & Related papers (2020-06-23T15:14:20Z) - VCG Mechanism Design with Unknown Agent Values under Stochastic Bandit
Feedback [104.06766271716774]
We study a multi-round welfare-maximising mechanism design problem in instances where agents do not know their values.
We first define three notions of regret for the welfare, the individual utilities of each agent and that of the mechanism.
Our framework also provides flexibility to control the pricing scheme so as to trade-off between the agent and seller regrets.
arXiv Detail & Related papers (2020-04-19T18:00:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.