API: Boosting Multi-Agent Reinforcement Learning via
Agent-Permutation-Invariant Networks
- URL: http://arxiv.org/abs/2203.05285v1
- Date: Thu, 10 Mar 2022 11:00:53 GMT
- Title: API: Boosting Multi-Agent Reinforcement Learning via
Agent-Permutation-Invariant Networks
- Authors: Xiaotian Hao, Weixun Wang, Hangyu Mao, Yaodong Yang, Dong Li, Yan
Zheng, Zhen Wang, Jianye Hao
- Abstract summary: Multi-agent reinforcement learning suffers from poor sample efficiency due to the exponential growth of the state-action space.
We propose two novel designs to achieve permutation invariance (PI).
The first design permutes the same but differently ordered inputs back to a common order, so the downstream networks only need to learn a function mapping over fixed-order inputs.
- Score: 35.63476630248861
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Multi-agent reinforcement learning suffers from poor sample efficiency due to
the exponential growth of the state-action space. Considering a homogeneous
multi-agent system, a global state consisting of $m$ homogeneous components has
$m!$ differently ordered representations, so designing functions that satisfy
permutation invariance (PI) can reduce the state space by a factor of
$\frac{1}{m!}$. However, mainstream MARL algorithms ignore this property and
learn over the original state space. Previous attempts to achieve PI, including
data-augmentation-based methods and embedding-sharing architectures, suffer
from training instability or limited model capacity. In this work, we propose
two novel designs that achieve PI while avoiding these limitations. The first
design permutes the same but differently ordered inputs back to a common order;
the downstream networks then only need to learn a function mapping over
fixed-order inputs instead of over all permutations, which is much easier to
train. The second design applies a hypernetwork to generate a customized
embedding for each component, giving higher representational capacity than the
previous embedding-sharing method. Empirical results on the SMAC benchmark show
that the proposed method achieves 100% win rates in almost all hard and
super-hard scenarios (never achieved before) and improves sample efficiency
over state-of-the-art baselines by up to 400%.
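The first design can be sketched as a sorting layer: any of the $m!$ orderings of the same set of components is mapped to one canonical ordering before the network sees it. A minimal NumPy sketch, where the lexicographic sort key and the single linear layer are illustrative stand-ins, not the paper's actual architecture:

```python
import numpy as np

def sort_then_encode(components, weights):
    """Map a set of m homogeneous components to a fixed-order input.

    Sorting by a content-based key sends all m! orderings of the same
    set to one canonical ordering, so the downstream network only has
    to fit a function over fixed-order inputs.
    """
    comps = np.asarray(components, dtype=float)
    # Canonical order: lexicographic sort over component features
    # (first feature is the primary key).
    order = np.lexsort(comps.T[::-1])
    canonical = comps[order]
    # Stand-in downstream network: a single linear layer.
    return canonical.reshape(-1) @ weights

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))   # m = 3 components, 4 features each
w = rng.normal(size=12)
perm = x[[2, 0, 1]]           # a differently ordered copy of the same set
assert np.isclose(sort_then_encode(x, w), sort_then_encode(perm, w))
```

Any fixed, content-based sort key works; the point is only that every permutation of the same set collapses to one input.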
Related papers
- Scalable Multi-agent Covering Option Discovery based on Kronecker Graphs [49.71319907864573]
In this paper, we propose a multi-agent skill discovery method that exploits the ease of decomposition.
Our key idea is to approximate the joint state space as a Kronecker graph, based on which we can directly estimate its Fiedler vector.
Considering that directly computing the Laplacian spectrum is intractable for tasks with infinite-scale state spaces, we further propose a deep learning extension of our method.
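The Kronecker structure is what makes the joint spectrum tractable: the eigenvectors of a Kronecker product of adjacency matrices are Kronecker products of the factors' eigenvectors, and the eigenvalues multiply. A small NumPy check of this property (illustrative only; the paper works with the Laplacian's Fiedler vector and adds approximations for that case):

```python
import numpy as np

A1 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)  # path P3
A2 = np.array([[0, 1], [1, 0]], dtype=float)                   # edge K2

w1, V1 = np.linalg.eigh(A1)   # factor spectra (cheap: small matrices)
w2, V2 = np.linalg.eigh(A2)

A = np.kron(A1, A2)           # joint adjacency of the product graph
v = np.kron(V1[:, 0], V2[:, 0])   # Kronecker product of factor eigenvectors
lam = w1[0] * w2[0]               # eigenvalues multiply under kron
assert np.allclose(A @ v, lam * v)
```

Because the joint spectrum is assembled from the factor spectra, one never has to eigendecompose the (possibly huge) joint matrix directly.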
arXiv Detail & Related papers (2023-07-21T14:53:12Z) - Cooperative Thresholded Lasso for Sparse Linear Bandit [6.52540785559241]
We present a novel approach to address the multi-agent sparse contextual linear bandit problem.
It is the first algorithm that tackles row-wise distributed data in sparse linear bandits.
It is widely applicable to high-dimensional multi-agent problems where efficient feature extraction is critical for minimizing regret.
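The thresholding idea can be sketched in isolation: fit a lasso estimate of the (sparse) reward parameter, then keep only coefficients above a threshold as the estimated support. A single-agent NumPy sketch using ISTA; the cooperative, row-wise-distributed protocol and the bandit loop are omitted, and all names and constants here are ours:

```python
import numpy as np

def lasso_ista(X, y, lam, n_iter=500):
    """Minimise (1/2n)||y - X b||^2 + lam * ||b||_1 via ISTA."""
    n, d = X.shape
    step = 1.0 / (np.linalg.norm(X, 2) ** 2 / n)  # 1/L of the smooth part
    b = np.zeros(d)
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y) / n
        z = b - step * grad
        # Soft-thresholding: the proximal operator of the l1 penalty.
        b = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return b

rng = np.random.default_rng(1)
n, d = 200, 20
X = rng.normal(size=(n, d))
true_b = np.zeros(d)
true_b[[2, 7]] = [1.5, -2.0]          # sparse ground-truth parameter
y = X @ true_b + 0.05 * rng.normal(size=n)

b_hat = lasso_ista(X, y, lam=0.1)
tau = 0.3                              # threshold step: keep large coefficients
support = set(np.flatnonzero(np.abs(b_hat) > tau))
assert support == {2, 7}
```

Restricting subsequent estimation to the recovered support is what keeps per-round computation low in high dimensions.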
arXiv Detail & Related papers (2023-05-30T16:05:44Z) - Federated Learning Using Variance Reduced Stochastic Gradient for
Probabilistically Activated Agents [0.0]
This paper proposes a two-layer Federated Learning (FL) algorithm that achieves both variance reduction and a faster convergence rate to an optimal solution in the setting where each agent has an arbitrary probability of being selected in each iteration.
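The interplay of probabilistic activation and variance reduction can be sketched with an SVRG-style update: an active agent's gradient correction is divided by its activation probability so the aggregate stays unbiased, while a periodically refreshed snapshot gradient keeps the variance low. An illustrative NumPy sketch on simple quadratic losses (not the paper's algorithm or notation):

```python
import numpy as np

# Each agent i holds f_i(x) = 0.5 * ||x - c_i||^2 and is active with
# probability p_i each round; the goal is to minimise the average loss.
rng = np.random.default_rng(2)
m, d = 5, 3
C = rng.normal(size=(m, d))          # per-agent optima c_i
p = np.full(m, 0.6)                  # activation probabilities
x = np.zeros(d)
snap = x.copy()
g_snap = (snap - C).mean(axis=0)     # full gradient at the snapshot
for t in range(400):
    active = rng.random(m) < p
    corr = np.zeros(d)
    for i in np.flatnonzero(active):
        # Divide by p_i: importance weighting keeps the estimate unbiased.
        corr += ((x - C[i]) - (snap - C[i])) / p[i]
    g = g_snap + corr / m            # unbiased, variance-reduced gradient
    x -= 0.5 * g
    if t % 20 == 19:                 # periodically refresh the snapshot
        snap = x.copy()
        g_snap = (snap - C).mean(axis=0)

opt = C.mean(axis=0)                 # minimiser of the average loss
assert np.linalg.norm(x - opt) < 1e-3
```

Because the correction term vanishes as `x` approaches the snapshot, the noise injected by random activation shrinks with the error instead of imposing a fixed variance floor.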
arXiv Detail & Related papers (2022-10-25T22:04:49Z) - Cluster and Aggregate: Face Recognition with Large Probe Set [18.662943303044315]
We propose a two-stage feature fusion paradigm, Cluster and Aggregate, that can both scale to large $N$ and maintain the ability to perform sequential inference with order invariance.
Experiments on IJB-B and IJB-S benchmark datasets show the superiority of the proposed two-stage paradigm in unconstrained face recognition.
arXiv Detail & Related papers (2022-10-19T20:01:15Z) - Combating Mode Collapse in GANs via Manifold Entropy Estimation [70.06639443446545]
Generative Adversarial Networks (GANs) have shown compelling results in various tasks and applications.
We propose a novel training pipeline to address the mode collapse issue of GANs.
arXiv Detail & Related papers (2022-08-25T12:33:31Z) - Dynamic Prototype Mask for Occluded Person Re-Identification [88.7782299372656]
Existing methods mainly address occlusion by employing body clues provided by an extra network to distinguish the visible parts.
We propose a novel Dynamic Prototype Mask (DPM) based on two self-evident prior knowledge.
Under this condition, the occluded representation could be well aligned in a selected subspace spontaneously.
arXiv Detail & Related papers (2022-07-19T03:31:13Z) - Exploiting Invariance in Training Deep Neural Networks [4.169130102668252]
Inspired by two basic mechanisms in animal visual systems, we introduce a feature transform technique that imposes invariance properties in the training of deep neural networks.
The resulting algorithm requires less parameter tuning, trains well with an initial learning rate of 1.0, and easily generalizes to different tasks.
Tested on ImageNet, MS COCO, and Cityscapes datasets, our proposed technique requires fewer iterations to train, surpasses all baselines by a large margin, seamlessly works on both small and large batch size training, and applies to different computer vision tasks of image classification, object detection, and semantic segmentation.
arXiv Detail & Related papers (2021-03-30T19:18:31Z) - A Fast Graph Neural Network-Based Method for Winner Determination in
Multi-Unit Combinatorial Auctions [44.14410999484577]
Combinatorial Auction (CA) is an efficient mechanism for resource allocation in different fields, including cloud computing.
The problem of allocating items among the bidders to maximize the auctioneer's revenue is NP-complete and inapproximable.
We propose leveraging machine learning (ML) techniques to develop a novel low-complexity algorithm for solving this problem with negligible revenue loss.
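For scale, the exact winner-determination problem that the learned algorithm approximates can only be solved by brute force on tiny instances. A single-unit Python sketch (the multi-unit generalization and the paper's GNN are omitted; the bid format is ours):

```python
from itertools import combinations

def winner_determination(bids):
    """Exact winner determination by brute force.

    Each bid is (bidder, frozenset_of_items, price); choose a
    revenue-maximising subset of bids whose item sets are pairwise
    disjoint. Exponential in the number of bids, which is exactly why
    a low-complexity learned heuristic is attractive.
    """
    best_rev, best_set = 0.0, ()
    for r in range(1, len(bids) + 1):
        for subset in combinations(bids, r):
            items = [it for _, s, _ in subset for it in s]
            if len(items) == len(set(items)):   # pairwise disjoint item sets
                rev = sum(price for _, _, price in subset)
                if rev > best_rev:
                    best_rev, best_set = rev, subset
    return best_rev, best_set

bids = [("a", frozenset({1, 2}), 5.0),
        ("b", frozenset({2, 3}), 4.0),
        ("c", frozenset({3}), 2.0)]
rev, winners = winner_determination(bids)
assert rev == 7.0 and {b[0] for b in winners} == {"a", "c"}
```

Even this three-bid instance already scans 7 candidate subsets; the count doubles with every additional bid.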
arXiv Detail & Related papers (2020-09-29T00:22:37Z) - Improving Robustness and Generality of NLP Models Using Disentangled
Representations [62.08794500431367]
Supervised neural networks first map an input $x$ to a single representation $z$, and then map $z$ to the output label $y$.
We present methods to improve robustness and generality of NLP models from the standpoint of disentangled representation learning.
We show that models trained with the proposed criteria provide better robustness and domain adaptation ability in a wide range of supervised learning tasks.
arXiv Detail & Related papers (2020-09-21T02:48:46Z) - Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal
Sample Complexity [67.02490430380415]
We show that model-based MARL achieves a sample complexity of $\tilde{O}(|S||A||B|(1-\gamma)^{-3}\epsilon^{-2})$ for finding the Nash equilibrium (NE) value up to some $\epsilon$ error.
We also show that such a sample bound is minimax-optimal (up to logarithmic factors) if the algorithm is reward-agnostic, where the algorithm queries state transition samples without reward knowledge.
arXiv Detail & Related papers (2020-07-15T03:25:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.