Decentralized Reinforcement Learning: Global Decision-Making via Local
Economic Transactions
- URL: http://arxiv.org/abs/2007.02382v2
- Date: Fri, 14 Aug 2020 05:20:29 GMT
- Title: Decentralized Reinforcement Learning: Global Decision-Making via Local
Economic Transactions
- Authors: Michael Chang, Sidhant Kaushik, S. Matthew Weinberg, Thomas L.
Griffiths, Sergey Levine
- Abstract summary: We establish a framework for directing a society of simple, specialized, self-interested agents to solve sequential decision problems.
We derive a class of decentralized reinforcement learning algorithms.
We demonstrate the potential advantages of a society's inherent modular structure for more efficient transfer learning.
- Score: 80.49176924360499
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper seeks to establish a framework for directing a society of simple,
specialized, self-interested agents to solve what traditionally are posed as
monolithic single-agent sequential decision problems. What makes it challenging
to use a decentralized approach to collectively optimize a central objective is
the difficulty in characterizing the equilibrium strategy profile of
non-cooperative games. To overcome this challenge, we design a mechanism for
defining the learning environment of each agent for which we know that the
optimal solution for the global objective coincides with a Nash equilibrium
strategy profile of the agents optimizing their own local objectives. The
society functions as an economy of agents that learn the credit assignment
process itself by buying and selling to each other the right to operate on the
environment state. We derive a class of decentralized reinforcement learning
algorithms that are broadly applicable not only to standard reinforcement
learning but also to selecting options in semi-MDPs and to dynamically composing
computation graphs. Lastly, we demonstrate the potential advantages of a
society's inherent modular structure for more efficient transfer learning.
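The economic mechanism described in the abstract can be made concrete with a small amount of code. Below is a minimal, illustrative Python sketch, not the paper's exact algorithm: each agent owns one primitive action, a per-step Vickrey (second-price) auction sells the right to operate on the current state, and the winner's learning target is the environment reward plus the next auction's winning bid for the state it produced. The `ChainEnv` toy task, the tabular bid update, and the tie-breaking and payment conventions are assumptions made for the sake of a runnable example; per the abstract, the same idea extends to agents that own options in semi-MDPs or modules of a computation graph rather than primitive actions.

```python
import numpy as np


class BiddingAgent:
    """A self-interested agent that owns one primitive action and learns a
    valuation (its bid) for the right to operate on each state."""

    def __init__(self, action, lr=0.1):
        self.action = action
        self.lr = lr
        self.values = {}  # state -> current bid

    def bid(self, state):
        return self.values.get(state, 0.0)

    def update(self, state, target):
        # Regress the bid toward the realized revenue of winning this state.
        v = self.values.get(state, 0.0)
        self.values[state] = v + self.lr * (target - v)


def vickrey_auction(agents, state, rng):
    """Highest bidder wins the right to act and pays the second-highest bid."""
    bids = np.array([a.bid(state) for a in agents])
    winner = int(rng.choice(np.flatnonzero(bids == bids.max())))  # random ties
    price = float(np.partition(bids, -2)[-2]) if len(bids) > 1 else 0.0
    return winner, price


class ChainEnv:
    """Toy chain MDP: start at 0, reward of 1 for reaching position n."""

    def __init__(self, n=5):
        self.n = n

    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):  # action is +1 (right) or -1 (left)
        self.pos = max(0, self.pos + action)
        done = self.pos >= self.n
        return self.pos, (1.0 if done else 0.0), done


def society_episode(env, agents, rng, gamma=0.99, max_steps=500):
    """One episode: the auction winner transforms the state, then 'sells'
    the resulting state to the winner of the next auction."""
    state = env.reset()
    winner, _ = vickrey_auction(agents, state, rng)
    for _ in range(max_steps):
        next_state, reward, done = env.step(agents[winner].action)
        if done:
            agents[winner].update(state, reward)
            break
        next_winner, _ = vickrey_auction(agents, next_state, rng)
        # Revenue = environment reward + the next auction's winning bid for
        # the state this agent produced (one of several resale conventions).
        # Under a Vickrey auction, bidding one's true expected revenue is a
        # dominant strategy, so the bid is trained toward realized revenue.
        target = reward + gamma * agents[next_winner].bid(next_state)
        agents[winner].update(state, target)
        state, winner = next_state, next_winner


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    env = ChainEnv()
    society = [BiddingAgent(+1), BiddingAgent(-1)]  # one agent per action
    for _ in range(2000):
        society_episode(env, society, rng)
    for agent in society:
        bids = {s: round(v, 2) for s, v in sorted(agent.values.items())}
        print(f"action {agent.action:+d}: {bids}")
```

In this toy, the agent owning the +1 action should learn to outbid its competitor at every position, so the society's emergent policy is the optimal one, and the winning bids approximate the discounted value of each state; no agent ever optimizes the global return directly, which is the point of the credit-assignment-by-resale construction.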
Related papers
- Federated $\mathcal{X}$-armed Bandit with Flexible Personalisation [3.74142789780782]
This paper introduces a novel approach to personalised federated learning within the $\mathcal{X}$-armed bandit framework.
Our method employs a surrogate objective function that combines individual client preferences with aggregated global knowledge, allowing for a flexible trade-off between personalisation and collective learning.
arXiv Detail & Related papers (2024-09-11T13:19:41Z)
- Principal-Agent Reinforcement Learning: Orchestrating AI Agents with Contracts [20.8288955218712]
We propose a framework where a principal guides an agent in a Markov Decision Process (MDP) using a series of contracts.
We present and analyze a meta-algorithm that iteratively optimizes the policies of the principal and agent.
We then scale our algorithm with deep Q-learning and analyze its convergence in the presence of approximation error.
arXiv Detail & Related papers (2024-07-25T14:28:58Z)
- ROMA-iQSS: An Objective Alignment Approach via State-Based Value Learning and ROund-Robin Multi-Agent Scheduling [44.276285521929424]
We introduce a decentralized state-based value learning algorithm that enables agents to independently discover optimal states.
Our theoretical analysis shows that our approach leads decentralized agents to an optimal collective policy.
Empirical experiments further demonstrate that our method outperforms existing decentralized state-based and action-based value learning strategies.
arXiv Detail & Related papers (2024-04-05T09:39:47Z)
- Personalized Reinforcement Learning with a Budget of Policies [9.846353643883443]
Personalization in machine learning (ML) tailors models' decisions to the individual characteristics of users.
We propose a novel framework termed represented Markov Decision Processes (r-MDPs) that is designed to balance the need for personalization with regulatory constraints.
In an r-MDP, we cater to a diverse user population, each with unique preferences, through interaction with a small set of representative policies.
We develop two deep reinforcement learning algorithms that efficiently solve r-MDPs.
arXiv Detail & Related papers (2024-01-12T11:27:55Z)
- MERMAIDE: Learning to Align Learners using Model-Based Meta-Learning [62.065503126104126]
We study how a principal can efficiently and effectively intervene on the rewards of a previously unseen learning agent in order to induce desirable outcomes.
This is relevant to many real-world settings, such as auctions or taxation, where the principal may know neither the learning behavior nor the rewards of real people.
We introduce MERMAIDE, a model-based meta-learning framework to train a principal that can quickly adapt to out-of-distribution agents.
arXiv Detail & Related papers (2023-04-10T15:44:50Z)
- Finding General Equilibria in Many-Agent Economic Simulations Using Deep Reinforcement Learning [72.23843557783533]
We show that deep reinforcement learning can discover stable solutions that are epsilon-Nash equilibria for a meta-game over agent types.
Our approach is more flexible and does not need unrealistic assumptions, e.g., market clearing.
We demonstrate our approach in real-business-cycle models, a representative family of DGE models, with 100 worker-consumers, 10 firms, and a government that taxes and redistributes.
arXiv Detail & Related papers (2022-01-03T17:00:17Z)
- Decentralized Q-Learning in Zero-sum Markov Games [33.81574774144886]
We study multi-agent reinforcement learning (MARL) in discounted zero-sum Markov games.
We develop for the first time a radically uncoupled Q-learning dynamics that is both rational and convergent.
The key challenge in this decentralized setting is the non-stationarity of the learning environment from an agent's perspective.
arXiv Detail & Related papers (2021-06-04T22:42:56Z)
- Competing Adaptive Networks [56.56653763124104]
We develop an algorithm for decentralized competition among teams of adaptive agents.
We present an application in the decentralized training of generative adversarial neural networks.
arXiv Detail & Related papers (2021-03-29T14:42:15Z)
- Learning Strategies in Decentralized Matching Markets under Uncertain Preferences [91.3755431537592]
We study decision-making under scarcity of shared resources when the preferences of agents are unknown a priori.
Our approach is based on the representation of preferences in a reproducing kernel Hilbert space.
We derive optimal strategies that maximize agents' expected payoffs.
arXiv Detail & Related papers (2020-10-29T03:08:22Z)
- F2A2: Flexible Fully-decentralized Approximate Actor-critic for Cooperative Multi-agent Reinforcement Learning [110.35516334788687]
Decentralized multi-agent reinforcement learning algorithms are sometimes impractical in complicated applications.
We propose a flexible fully decentralized actor-critic MARL framework, which can handle large-scale general cooperative multi-agent settings.
Our framework achieves scalability and stability in large-scale environments and reduces information transmission.
arXiv Detail & Related papers (2020-04-17T14:56:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.