Learning Individual Policies in Large Multi-agent Systems through Local
Variance Minimization
- URL: http://arxiv.org/abs/2212.13379v1
- Date: Tue, 27 Dec 2022 06:59:00 GMT
- Title: Learning Individual Policies in Large Multi-agent Systems through Local
Variance Minimization
- Authors: Tanvi Verma, Pradeep Varakantham
- Abstract summary: In multi-agent systems with a large number of agents, the contribution of each agent to the value of other agents is minimal.
We provide a novel Multi-Agent Reinforcement Learning (MARL) mechanism that minimizes variance across values of agents in the same state.
We show that our approach reduces the variance in revenues earned by taxi drivers, while still providing higher joint revenues than leading approaches.
- Score: 8.140037969280716
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In multi-agent systems with a large number of agents, the
contribution of each agent to the value of other agents is typically
minimal (e.g., in aggregation systems such as Uber and Deliveroo). In this
paper, we consider such multi-agent systems in which each agent is
self-interested and takes a sequence of decisions, and we represent them as
a Stochastic Non-atomic Congestion Game (SNCG). We derive key properties of
equilibrium solutions in the SNCG model with both non-atomic and nearly
non-atomic agents. With those key equilibrium
properties, we provide a novel Multi-Agent Reinforcement Learning (MARL)
mechanism that minimizes variance across values of agents in the same state. To
demonstrate the utility of this new mechanism, we provide detailed results on a
real-world taxi dataset and on a generic simulator for aggregation systems.
We show that our approach reduces the variance in revenues earned by taxi
drivers, while still providing higher joint revenues than leading approaches.
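The abstract describes the variance-minimization mechanism only at a high level. As a rough illustration of the idea (not the authors' actual algorithm), one way to realize "minimize variance across values of agents in the same state" is an auxiliary penalty added to a standard value-learning loss. The sketch below is a minimal PyTorch example; all names in it (variance_penalty, combined_loss, beta) are hypothetical.

```python
# Hypothetical sketch: a value-variance penalty for agents sharing a state.
# This is NOT the paper's algorithm, only an illustration of the idea of
# minimizing variance across values of agents in the same state.
import torch

def variance_penalty(values: torch.Tensor) -> torch.Tensor:
    """Variance of the value estimates of agents occupying the same state."""
    return torch.var(values, unbiased=False)

def combined_loss(td_errors: torch.Tensor,
                  values_same_state: torch.Tensor,
                  beta: float = 0.1) -> torch.Tensor:
    """Standard TD loss plus a weighted variance penalty.

    td_errors:         per-agent TD errors, shape (n_agents,)
    values_same_state: value estimates of the agents sharing a state
    beta:              weight of the variance term (assumed hyperparameter)
    """
    td_loss = td_errors.pow(2).mean()
    return td_loss + beta * variance_penalty(values_same_state)
```

At equilibrium in an SNCG, agents in the same state should earn (near-)equal values, so driving this variance term toward zero nudges learning toward equilibrium-like solutions.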
Related papers
- Enhancing Heterogeneous Multi-Agent Cooperation in Decentralized MARL via GNN-driven Intrinsic Rewards [1.179778723980276]
Multi-agent Reinforcement Learning (MARL) is emerging as a key framework for sequential decision-making and control tasks.
The deployment of these systems in real-world scenarios often requires decentralized training, a diverse set of agents, and learning from infrequent environmental reward signals.
We propose the CoHet algorithm, which utilizes a novel Graph Neural Network (GNN) based intrinsic motivation to facilitate the learning of heterogeneous agent policies.
arXiv Detail & Related papers (2024-08-12T21:38:40Z)
- On the Complexity of Multi-Agent Decision Making: From Learning in Games to Partial Monitoring [105.13668993076801]
A central problem in the theory of multi-agent reinforcement learning (MARL) is to understand what structural conditions and algorithmic principles lead to sample-efficient learning guarantees.
We study this question in a general framework for interactive decision making with multiple agents.
We show that characterizing the statistical complexity for multi-agent decision making is equivalent to characterizing the statistical complexity of single-agent decision making.
arXiv Detail & Related papers (2023-05-01T06:46:22Z)
- Decentralized scheduling through an adaptive, trading-based multi-agent system [1.7403133838762448]
In multi-agent reinforcement learning systems, the actions of one agent can have a negative impact on the rewards of other agents.
This work applies a trading approach to a simulated scheduling environment, where the agents are responsible for the assignment of incoming jobs to compute cores.
The agents can trade the usage right of computational cores to process high-priority, high-reward jobs faster than low-priority, low-reward jobs.
arXiv Detail & Related papers (2022-07-05T13:50:18Z)
- Multi-Agent MDP Homomorphic Networks [100.74260120972863]
In cooperative multi-agent systems, complex symmetries arise between different configurations of the agents and their local observations.
Existing work on symmetries in single-agent reinforcement learning can only be generalized to the fully centralized setting.
This paper introduces Multi-Agent MDP Homomorphic Networks, a class of networks that allows distributed execution using only local information.
arXiv Detail & Related papers (2021-10-09T07:46:25Z)
- Multi-Agent Imitation Learning with Copulas [102.27052968901894]
Multi-agent imitation learning aims to train multiple agents to perform tasks from demonstrations by learning a mapping between observations and actions.
In this paper, we propose to use copulas, powerful statistical tools for capturing dependence among random variables, to explicitly model the correlation and coordination in multi-agent systems.
Our proposed model separately learns marginals that capture the local behavioral patterns of each individual agent, as well as a copula function that fully captures the dependence structure among agents (a toy sketch of this decomposition follows this entry).
arXiv Detail & Related papers (2021-07-10T03:49:41Z)
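Since the copula summary above is abstract, here is a toy sketch (illustrative only, and not the learned model from the paper) of how a Gaussian copula separates per-agent marginals from the dependence structure; the distributions chosen are arbitrary assumptions.

```python
# Minimal Gaussian-copula sketch: sample correlated uniforms, then map them
# through arbitrary per-agent marginals. Illustrative only; the paper's
# model is learned from demonstrations rather than fixed like this.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Dependence structure: correlation matrix of a latent Gaussian.
corr = np.array([[1.0, 0.8],
                 [0.8, 1.0]])

# 1) Draw correlated normals and map them to correlated uniforms (the copula).
z = rng.multivariate_normal(mean=np.zeros(2), cov=corr, size=1000)
u = stats.norm.cdf(z)  # each column is marginally Uniform(0, 1)

# 2) Apply each agent's own marginal via its inverse CDF.
x_a = stats.expon.ppf(u[:, 0], scale=2.0)    # agent A: exponential marginal
x_b = stats.beta.ppf(u[:, 1], a=2.0, b=5.0)  # agent B: beta marginal

# The marginals are exactly exponential/beta, while any correlation between
# x_a and x_b comes solely from the copula (the latent correlation matrix).
```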
- ERMAS: Becoming Robust to Reward Function Sim-to-Real Gaps in Multi-Agent Simulations [110.72725220033983]
Epsilon-Robust Multi-Agent Simulation (ERMAS) is a framework for learning AI policies that are robust to such multi-agent sim-to-real gaps.
In particular, ERMAS learns tax policies that are robust to changes in agent risk aversion, improving social welfare by up to 15% in complex spatiotemporal simulations.
arXiv Detail & Related papers (2021-06-10T04:32:20Z)
- Is Independent Learning All You Need in the StarCraft Multi-Agent Challenge? [100.48692829396778]
Independent PPO (IPPO) is a form of independent learning in which each agent simply estimates its local value function.
IPPO's strong performance may be due to its robustness to some forms of environment non-stationarity (a minimal sketch of the independent-learner setup follows this entry).
arXiv Detail & Related papers (2020-11-18T20:29:59Z)
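As a concrete picture of what "each agent simply estimates its local value function" means, the sketch below sets up fully independent learners, one policy and one local critic per agent. It is a minimal, assumption-laden illustration, not the IPPO implementation from the paper (which adds PPO clipping, GAE, and so on).

```python
# Hypothetical sketch of independent learning: every agent keeps its own
# policy and its own LOCAL value function; there is no centralized critic.
import torch
import torch.nn as nn

class IndependentAgent(nn.Module):
    def __init__(self, obs_dim: int, n_actions: int):
        super().__init__()
        self.policy = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(),
                                    nn.Linear(64, n_actions))
        self.value = nn.Sequential(nn.Linear(obs_dim, 64), nn.Tanh(),
                                   nn.Linear(64, 1))  # local value only

    def act(self, obs: torch.Tensor) -> torch.Tensor:
        logits = self.policy(obs)
        return torch.distributions.Categorical(logits=logits).sample()

# One learner per agent; each is trained from its own observations alone.
agents = [IndependentAgent(obs_dim=8, n_actions=4) for _ in range(5)]
```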
- Calibration of Shared Equilibria in General Sum Partially Observable Markov Games [15.572157454411533]
We consider a general sum partially observable Markov game where agents of different types share a single policy network.
This paper aims at i) formally understanding equilibria reached by such agents, and ii) matching emergent phenomena of such equilibria to real-world targets.
arXiv Detail & Related papers (2020-06-23T15:14:20Z)
- Distributed Reinforcement Learning for Cooperative Multi-Robot Object Manipulation [53.262360083572005]
We consider solving a cooperative multi-robot object manipulation task using reinforcement learning (RL).
We propose two distributed multi-agent RL approaches: distributed approximate RL (DA-RL) and game-theoretic RL (GT-RL).
Although we focus on a small system of two agents in this paper, both DA-RL and GT-RL apply to general multi-agent systems, and are expected to scale well to large systems.
arXiv Detail & Related papers (2020-03-21T00:43:54Z)
- Value Variance Minimization for Learning Approximate Equilibrium in Aggregation Systems [8.140037969280716]
In this paper, we consider the problem of learning approximate equilibrium solutions (win-win) in aggregation systems, so that individuals have an incentive to remain in the aggregation system.
arXiv Detail & Related papers (2020-03-16T10:02:42Z)