Decentralized Reinforcement Learning: Global Decision-Making via Local
Economic Transactions
- URL: http://arxiv.org/abs/2007.02382v2
- Date: Fri, 14 Aug 2020 05:20:29 GMT
- Title: Decentralized Reinforcement Learning: Global Decision-Making via Local
Economic Transactions
- Authors: Michael Chang, Sidhant Kaushik, S. Matthew Weinberg, Thomas L.
Griffiths, Sergey Levine
- Abstract summary: We establish a framework for directing a society of simple, specialized, self-interested agents to solve sequential decision problems.
We derive a class of decentralized reinforcement learning algorithms.
We demonstrate the potential advantages of a society's inherent modular structure for more efficient transfer learning.
- Score: 80.49176924360499
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper seeks to establish a framework for directing a society of simple,
specialized, self-interested agents to solve what traditionally are posed as
monolithic single-agent sequential decision problems. What makes it challenging
to use a decentralized approach to collectively optimize a central objective is
the difficulty in characterizing the equilibrium strategy profile of
non-cooperative games. To overcome this challenge, we design a mechanism for
defining the learning environment of each agent for which we know that the
optimal solution for the global objective coincides with a Nash equilibrium
strategy profile of the agents optimizing their own local objectives. The
society functions as an economy of agents that learn the credit assignment
process itself by buying and selling to each other the right to operate on the
environment state. We derive a class of decentralized reinforcement learning
algorithms that are broadly applicable not only to standard reinforcement
learning but also to selecting options in semi-MDPs and to dynamically composing
computation graphs. Lastly, we demonstrate the potential advantages of a
society's inherent modular structure for more efficient transfer learning.
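The economic mechanism described in the abstract can be made concrete with a small amount of code. Below is a minimal, illustrative Python sketch, not the paper's exact algorithm: each agent owns one primitive action, a per-step Vickrey (second-price) auction sells the right to operate on the current state, and the winner's learning target is the environment reward plus the next auction's winning bid for the state it produced. The `ChainEnv` toy task, the tabular bid update, and the tie-breaking and payment conventions are assumptions made for the sake of a runnable example; per the abstract, the same idea extends to agents that own options in semi-MDPs or modules of a computation graph rather than primitive actions.

```python
import numpy as np


class BiddingAgent:
    """A self-interested agent that owns one primitive action and learns a
    valuation (its bid) for the right to operate on each state."""

    def __init__(self, action, lr=0.1):
        self.action = action
        self.lr = lr
        self.values = {}  # state -> current bid

    def bid(self, state):
        return self.values.get(state, 0.0)

    def update(self, state, target):
        # Regress the bid toward the realized revenue of winning this state.
        v = self.values.get(state, 0.0)
        self.values[state] = v + self.lr * (target - v)


def vickrey_auction(agents, state, rng):
    """Highest bidder wins the right to act and pays the second-highest bid."""
    bids = np.array([a.bid(state) for a in agents])
    winner = int(rng.choice(np.flatnonzero(bids == bids.max())))  # random ties
    price = float(np.partition(bids, -2)[-2]) if len(bids) > 1 else 0.0
    return winner, price


class ChainEnv:
    """Toy chain MDP: start at 0, reward of 1 for reaching position n."""

    def __init__(self, n=5):
        self.n = n

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):  # action is +1 (right) or -1 (left)
        self.pos = max(0, self.pos + action)
        done = self.pos >= self.n
        return self.pos, (1.0 if done else 0.0), done


def society_episode(env, agents, rng, gamma=0.99, max_steps=500):
    """One episode: the auction winner transforms the state, then 'sells'
    the resulting state to the winner of the next auction."""
    state = env.reset()
    winner, _ = vickrey_auction(agents, state, rng)
    for _ in range(max_steps):
        next_state, reward, done = env.step(agents[winner].action)
        if done:
            agents[winner].update(state, reward)
            break
        next_winner, _ = vickrey_auction(agents, next_state, rng)
        # Revenue = environment reward + the next auction's winning bid for
        # the state this agent produced (one of several resale conventions).
        # Under a Vickrey auction, bidding one's true expected revenue is a
        # dominant strategy, so the bid is trained toward realized revenue.
        target = reward + gamma * agents[next_winner].bid(next_state)
        agents[winner].update(state, target)
        state, winner = next_state, next_winner


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    env = ChainEnv()
    society = [BiddingAgent(+1), BiddingAgent(-1)]  # one agent per action
    for _ in range(2000):
        society_episode(env, society, rng)
    for agent in society:
        bids = {s: round(v, 2) for s, v in sorted(agent.values.items())}
        print(f"action {agent.action:+d}: {bids}")
```

In this toy, the agent owning the +1 action should learn to outbid its competitor at every position, so the society's emergent policy is the optimal one, and the winning bids approximate the discounted value of each state; no agent ever optimizes the global return directly, which is the point of the credit-assignment-by-resale construction.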
Related papers
- Federated $\mathcal{X}$-armed Bandit with Flexible Personalisation [3.74142789780782]
This paper introduces a novel approach to personalised federated learning within the $\mathcal{X}$-armed bandit framework.
Our method employs a surrogate objective function that combines individual client preferences with aggregated global knowledge, allowing for a flexible trade-off between personalisation and collective learning.
arXiv Detail & Related papers (2024-09-11T13:19:41Z)
- Principal-Agent Reinforcement Learning: Orchestrating AI Agents with Contracts [20.8288955218712]
We propose a framework where a principal guides an agent in a Markov Decision Process (MDP) using a series of contracts.
We present and analyze a meta-algorithm that iteratively optimizes the policies of the principal and agent.
We then scale our algorithm with deep Q-learning and analyze its convergence in the presence of approximation error.
arXiv Detail & Related papers (2024-07-25T14:28:58Z)
- ROMA-iQSS: An Objective Alignment Approach via State-Based Value Learning and ROund-Robin Multi-Agent Scheduling [44.276285521929424]
We introduce a decentralized state-based value learning algorithm that enables agents to independently discover optimal states.
Our theoretical analysis shows that our approach leads decentralized agents to an optimal collective policy.
Empirical experiments further demonstrate that our method outperforms existing decentralized state-based and action-based value learning strategies.
arXiv Detail & Related papers (2024-04-05T09:39:47Z)
- Personalized Reinforcement Learning with a Budget of Policies [9.846353643883443]
Personalization in machine learning (ML) tailors models' decisions to the individual characteristics of users.
We propose a novel framework termed represented Markov Decision Processes (r-MDPs) that is designed to balance the need for personalization with regulatory constraints.
In an r-MDP, we cater to a diverse user population, each with unique preferences, through interaction with a small set of representative policies.
We develop two deep reinforcement learning algorithms that efficiently solve r-MDPs.
arXiv Detail & Related papers (2024-01-12T11:27:55Z)
- MERMAIDE: Learning to Align Learners using Model-Based Meta-Learning [62.065503126104126]
We study how a principal can efficiently and effectively intervene on the rewards of a previously unseen learning agent in order to induce desirable outcomes.
This is relevant to many real-world settings, such as auctions or taxation, where the principal may know neither the learning behavior nor the rewards of real people.
We introduce MERMAIDE, a model-based meta-learning framework to train a principal that can quickly adapt to out-of-distribution agents.
arXiv Detail & Related papers (2023-04-10T15:44:50Z)
- Finding General Equilibria in Many-Agent Economic Simulations Using Deep Reinforcement Learning [72.23843557783533]
We show that deep reinforcement learning can discover stable solutions that are epsilon-Nash equilibria for a meta-game over agent types.
Our approach is more flexible and does not need unrealistic assumptions, e.g., market clearing.
We demonstrate our approach in real-business-cycle models, a representative family of DGE models, with 100 worker-consumers, 10 firms, and a government that taxes and redistributes.
arXiv Detail & Related papers (2022-01-03T17:00:17Z)
- Decentralized Q-Learning in Zero-sum Markov Games [33.81574774144886]
We study multi-agent reinforcement learning (MARL) in discounted zero-sum Markov games.
We develop for the first time a radically uncoupled Q-learning dynamics that is both rational and convergent.
The key challenge in this decentralized setting is the non-stationarity of the learning environment from an agent's perspective.
arXiv Detail & Related papers (2021-06-04T22:42:56Z)
- Competing Adaptive Networks [56.56653763124104]
We develop an algorithm for decentralized competition among teams of adaptive agents.
We present an application in the decentralized training of generative adversarial neural networks.
arXiv Detail & Related papers (2021-03-29T14:42:15Z)
- Learning Strategies in Decentralized Matching Markets under Uncertain Preferences [91.3755431537592]
We study decision-making under scarcity of shared resources when the preferences of agents are unknown a priori.
Our approach is based on the representation of preferences in a reproducing kernel Hilbert space.
We derive optimal strategies that maximize agents' expected payoffs.
arXiv Detail & Related papers (2020-10-29T03:08:22Z)
- F2A2: Flexible Fully-decentralized Approximate Actor-critic for Cooperative Multi-agent Reinforcement Learning [110.35516334788687]
Decentralized multi-agent reinforcement learning algorithms are sometimes impractical in complicated applications.
We propose a flexible fully decentralized actor-critic MARL framework, which can handle large-scale general cooperative multi-agent settings.
Our framework achieves scalability and stability in large-scale environments and reduces information transmission.
arXiv Detail & Related papers (2020-04-17T14:56:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.