Learning Fair Policies in Decentralized Cooperative Multi-Agent
Reinforcement Learning
- URL: http://arxiv.org/abs/2012.09421v2
- Date: Mon, 1 Mar 2021 05:32:23 GMT
- Title: Learning Fair Policies in Decentralized Cooperative Multi-Agent
Reinforcement Learning
- Authors: Matthieu Zimmer, Claire Glanois, Umer Siddique, Paul Weng
- Abstract summary: We consider the problem of learning fair policies in (deep) cooperative multi-agent reinforcement learning (MARL).
We propose a novel neural network architecture composed of two sub-networks specifically designed to take into account the two aspects of fairness.
- Score: 12.215625537879108
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider the problem of learning fair policies in (deep) cooperative
multi-agent reinforcement learning (MARL). We formalize it in a principled way
as the problem of optimizing a welfare function that explicitly encodes two
important aspects of fairness: efficiency and equity. As a solution method, we
propose a novel neural network architecture, which is composed of two
sub-networks specifically designed to take into account the two aspects of
fairness. In experiments, we demonstrate the importance of the two sub-networks
for fair optimization. Our overall approach is general as it can accommodate
any (sub)differentiable welfare function. Therefore, it is compatible with
various notions of fairness that have been proposed in the literature (e.g.,
lexicographic maximin, generalized Gini social welfare function, proportional
fairness). Our solution method is generic and can be implemented in various
MARL settings: centralized training and decentralized execution, or fully
decentralized. Finally, we experimentally validate our approach in various
domains and show that it can perform much better than previous methods.
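To make "any (sub)differentiable welfare function" concrete, below is a minimal PyTorch sketch of two of the welfare functions named in the abstract: the generalized Gini social welfare function (GGF) and proportional fairness. The agent count, weights, and utility values are illustrative choices, not taken from the paper.

```python
import torch

def ggf_welfare(utilities: torch.Tensor, weights: torch.Tensor) -> torch.Tensor:
    """Generalized Gini social welfare function (GGF).

    Per-agent utilities are sorted in ascending order and combined with
    strictly decreasing positive weights, so worse-off agents count more.
    torch.sort is differentiable almost everywhere, which keeps the whole
    welfare (sub)differentiable, as the approach requires.
    """
    sorted_utils, _ = torch.sort(utilities)  # ascending: worst-off agent first
    return torch.dot(weights, sorted_utils)

def proportional_fairness(utilities: torch.Tensor) -> torch.Tensor:
    """Proportional fairness: sum of log utilities (requires utilities > 0)."""
    return torch.log(utilities).sum()

# Illustrative values for three agents.
u = torch.tensor([1.0, 4.0, 2.0], requires_grad=True)
w = torch.tensor([0.6, 0.3, 0.1])  # decreasing weights favor the worst-off
welfare = ggf_welfare(u, w)
welfare.backward()  # subgradients w.r.t. per-agent utilities
print(welfare.item(), u.grad)
```

With equal weights the GGF reduces to the utilitarian sum (pure efficiency); concentrating all weight on the smallest utility approaches maximin (pure equity), which is how a single welfare function can encode both aspects at once.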
Related papers
- DECAF: Learning to be Fair in Multi-agent Resource Allocation [4.788163807490197]
We propose methods to learn fair and efficient policies in centralized resource allocation.
Our methods are applied to learning long-term fairness within a novel, general framework for multi-agent systems.
arXiv Detail & Related papers (2025-02-06T18:29:11Z) - ROMA-iQSS: An Objective Alignment Approach via State-Based Value Learning and ROund-Robin Multi-Agent Scheduling [44.276285521929424]
We introduce a decentralized state-based value learning algorithm that enables agents to independently discover optimal states.
Our theoretical analysis shows that our approach leads decentralized agents to an optimal collective policy.
Empirical experiments further demonstrate that our method outperforms existing decentralized state-based and action-based value learning strategies.
arXiv Detail & Related papers (2024-04-05T09:39:47Z) - MaxMin-RLHF: Alignment with Diverse Human Preferences [101.57443597426374]
- MaxMin-RLHF: Alignment with Diverse Human Preferences [101.57443597426374]
Reinforcement Learning from Human Feedback (RLHF) aligns language models to human preferences by employing a singular reward model derived from preference data.
We learn a mixture of preference distributions via an expectation-maximization algorithm to better represent diverse human preferences.
Our algorithm achieves an average improvement of more than 16% in win-rates over conventional RLHF algorithms.
arXiv Detail & Related papers (2024-02-14T03:56:27Z) - Achieving Fairness in Multi-Agent Markov Decision Processes Using
- Achieving Fairness in Multi-Agent Markov Decision Processes Using Reinforcement Learning [30.605881670761853]
We propose a Reinforcement Learning approach to achieve fairness in finite-horizon episodic MDPs.
We show that such an approach achieves sub-linear regret in terms of the number of episodes.
arXiv Detail & Related papers (2023-06-01T03:43:53Z) - Expeditious Saliency-guided Mix-up through Random Gradient Thresholding [89.59134648542042]
Mix-up training approaches have proven to be effective in improving the generalization ability of Deep Neural Networks.
In this paper, inspired by the complementary strengths of the two directions, we introduce a novel method that lies at their junction.
We name our method R-Mix, following the concept of "Random Mix-up".
In order to address the question of whether there exists a better decision protocol, we train a Reinforcement Learning agent that decides the mix-up policies.
arXiv Detail & Related papers (2022-12-09T14:29:57Z) - FIXED: Frustratingly Easy Domain Generalization with Mixup [53.782029033068675]
- FIXED: Frustratingly Easy Domain Generalization with Mixup [53.782029033068675]
Domain generalization (DG) aims to learn a generalizable model from multiple training domains such that it can perform well on unseen target domains.
A popular strategy is to augment training data to benefit generalization through methods such as Mixup (Zhang et al., 2018).
We propose a simple yet effective enhancement for Mixup-based DG, namely domain-invariant Feature mIXup (FIX).
Our approach significantly outperforms nine state-of-the-art related methods, beating the best performing baseline by 6.5% on average in terms of test accuracy.
arXiv Detail & Related papers (2022-11-07T09:38:34Z) - How Robust is Your Fairness? Evaluating and Sustaining Fairness under
Unseen Distribution Shifts [107.72786199113183]
We propose a novel fairness learning method termed CUrvature MAtching (CUMA).
CUMA achieves robust fairness generalizable to unseen domains with unknown distributional shifts.
We evaluate our method on three popular fairness datasets.
arXiv Detail & Related papers (2022-07-04T02:37:50Z) - Revisiting Some Common Practices in Cooperative Multi-Agent
Reinforcement Learning [11.91425153754564]
We show that in environments with a highly multi-modal reward landscape, value decomposition and parameter sharing can be problematic and lead to undesired outcomes.
In contrast, policy gradient (PG) methods with individual policies provably converge to an optimal solution in these cases.
We present practical suggestions on implementing multi-agent PG algorithms for either high rewards or diverse emergent behaviors.
arXiv Detail & Related papers (2022-06-15T13:03:05Z) - MultiFair: Multi-Group Fairness in Machine Learning [52.24956510371455]
- MultiFair: Multi-Group Fairness in Machine Learning [52.24956510371455]
We study multi-group fairness in machine learning (MultiFair).
We propose a generic end-to-end algorithmic framework to solve it.
Our proposed framework is generalizable to many different settings.
arXiv Detail & Related papers (2021-05-24T02:30:22Z) - F2A2: Flexible Fully-decentralized Approximate Actor-critic for
Cooperative Multi-agent Reinforcement Learning [110.35516334788687]
Decentralized multi-agent reinforcement learning algorithms are sometimes impractical in complicated applications.
We propose a flexible, fully decentralized actor-critic MARL framework that can handle large-scale general cooperative multi-agent settings.
Our framework achieves scalability and stability in large-scale environments and reduces information transmission.
arXiv Detail & Related papers (2020-04-17T14:56:29Z)