S2RL: Do We Really Need to Perceive All States in Deep Multi-Agent
Reinforcement Learning?
- URL: http://arxiv.org/abs/2206.11054v1
- Date: Mon, 20 Jun 2022 07:33:40 GMT
- Title: S2RL: Do We Really Need to Perceive All States in Deep Multi-Agent
Reinforcement Learning?
- Authors: Shuang Luo, Yinchuan Li, Jiahui Li, Kun Kuang, Furui Liu, Yunfeng
Shao, Chao Wu
- Abstract summary: Collaborative multi-agent reinforcement learning (MARL) has been widely used in many practical applications.
We propose a sparse state based MARL framework, which utilizes a sparse attention mechanism to discard irrelevant information in local observations.
- Score: 26.265100805551764
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Collaborative multi-agent reinforcement learning (MARL) has been widely used
in many practical applications, where each agent makes a decision based on its
own observation. Most mainstream methods treat each local observation as an
entirety when modeling the decentralized local utility functions. However, they
ignore the fact that local observation information can be further divided into
several entities, and only part of the entities is helpful to model inference.
Moreover, the importance of different entities may change over time. To improve
the performance of decentralized policies, the attention mechanism is used to
capture features of local information. Nevertheless, existing attention models
rely on dense fully connected graphs and cannot better perceive important
states. To this end, we propose a sparse state based MARL (S2RL) framework,
which utilizes a sparse attention mechanism to discard irrelevant information
in local observations. The local utility functions are estimated through the
self-attention and sparse attention mechanisms separately, then are combined
into a standard joint value function and auxiliary joint value function in the
central critic. We design the S2RL framework as a plug-and-play module, making
it general enough to be applied to various methods. Extensive experiments on
StarCraft II show that S2RL can significantly improve the performance of many
state-of-the-art methods.
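The abstract does not say which sparse attention operator S2RL uses. As a minimal sketch of the general idea, the example below contrasts softmax with sparsemax, one well-known operator that can assign exactly zero weight to low-scoring items. The entity scores and the function names are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import math


def softmax(scores):
    """Dense attention weights: every entity gets a strictly positive weight."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sparsemax(scores):
    """Sparse attention weights: Euclidean projection of the scores onto the
    probability simplex, which can set low-scoring entities to exactly zero."""
    ordered = sorted(scores, reverse=True)
    cumsum = 0.0
    tau = 0.0
    for j, z in enumerate(ordered, start=1):
        cumsum += z
        # Entity j stays in the support while this condition holds;
        # the last j that satisfies it fixes the threshold tau.
        if 1 + j * z > cumsum:
            tau = (cumsum - 1) / j
    return [max(s - tau, 0.0) for s in scores]


# Hypothetical attention scores of one agent over four observed entities.
entity_scores = [1.0, 0.8, 0.1, -1.0]

dense = softmax(entity_scores)     # all four weights are non-zero
weights = sparsemax(entity_scores)  # the two lowest-scoring entities are dropped

print("softmax  :", dense)
print("sparsemax:", weights)
```

With weights like these, an agent's attended feature would be a weighted sum over only the entities that keep non-zero weight, which is one way irrelevant parts of a local observation could be discarded.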
Related papers
- Beyond Local Views: Global State Inference with Diffusion Models for Cooperative Multi-Agent Reinforcement Learning [36.25611963252774]
State Inference with Diffusion Models (SIDIFF) is inspired by image outpainting.
SIDIFF reconstructs the original global state based solely on local observations.
It can be effortlessly incorporated into current multi-agent reinforcement learning algorithms.
arXiv Detail & Related papers (2024-08-18T14:49:53Z) - ECEA: Extensible Co-Existing Attention for Few-Shot Object Detection [52.16237548064387]
Few-shot object detection (FSOD) identifies objects from extremely few annotated samples.
Most recent FSOD methods apply the two-stage learning paradigm, which transfers the knowledge learned from abundant base classes to assist the few-shot detectors by learning the global features.
We propose an Extensible Co-Existing Attention (ECEA) module to enable the model to infer the global object according to the local parts.
arXiv Detail & Related papers (2023-09-15T06:55:43Z) - Global Meets Local: Effective Multi-Label Image Classification via
Category-Aware Weak Supervision [37.761378069277676]
This paper builds a unified framework to perform effective noisy-proposal suppression.
We develop a cross-granularity attention module to explore the complementary information between global and local features.
Our framework achieves superior performance over state-of-the-art methods.
arXiv Detail & Related papers (2022-11-23T05:39:17Z) - Centralized Feature Pyramid for Object Detection [53.501796194901964]
Visual feature pyramids have shown their superiority in both effectiveness and efficiency in a wide range of applications.
In this paper, we propose a Centralized Feature Pyramid (CFP) for object detection, which is based on a globally explicit centralized feature regulation.
arXiv Detail & Related papers (2022-10-05T08:32:54Z) - Federated and Generalized Person Re-identification through Domain and
Feature Hallucinating [88.77196261300699]
We study the problem of federated domain generalization (FedDG) for person re-identification (re-ID).
We propose a novel method, called "Domain and Feature Hallucinating (DFH)", to produce diverse features for learning generalized local and global models.
Our method achieves the state-of-the-art performance for FedDG on four large-scale re-ID benchmarks.
arXiv Detail & Related papers (2022-03-05T09:15:13Z) - Local2Global: A distributed approach for scaling representation learning
on graphs [10.254620252788776]
We propose a decentralised "local2global" approach to graph representation learning that one can a priori use to scale any embedding technique.
We show that our approach achieves a good trade-off between scale and accuracy on edge reconstruction and semi-supervised classification.
We also consider the downstream task of anomaly detection and show how one can use local2global to highlight anomalies in cybersecurity networks.
arXiv Detail & Related papers (2022-01-12T23:00:22Z) - Feature-Attending Recurrent Modules for Generalization in Reinforcement
Learning [27.736730414205137]
"Feature-Attending Recurrent Modules" (FARM) is an architecture for learning state representations that relies on simple, broadly applicable inductive biases for spatial and temporal regularities.
FARM learns a state representation that is distributed across multiple modules that each attend to capturing features with an expressive feature attention mechanism.
We show that this improves an RL agent's ability to generalize across object-centric tasks.
arXiv Detail & Related papers (2021-12-15T12:48:12Z) - Decentralised Person Re-Identification with Selective Knowledge
Aggregation [56.40855978874077]
Existing person re-identification (Re-ID) methods mostly follow a centralised learning paradigm which shares all training data to a collection for model learning.
Two recent works have introduced decentralised (federated) Re-ID learning for constructing a globally generalised model (server).
However, these methods are poor at adapting the generalised model to maximise its performance on individual client domain Re-ID tasks.
We present a new Selective Knowledge Aggregation approach to decentralised person Re-ID to optimise the trade-off between model personalisation and generalisation.
arXiv Detail & Related papers (2021-10-21T18:09:53Z) - Locality Matters: A Scalable Value Decomposition Approach for
Cooperative Multi-Agent Reinforcement Learning [52.7873574425376]
Cooperative multi-agent reinforcement learning (MARL) faces significant scalability issues due to state and action spaces that are exponentially large in the number of agents.
We propose a novel, value-based multi-agent algorithm called LOMAQ, which incorporates local rewards in the Centralized Training Decentralized Execution paradigm.
arXiv Detail & Related papers (2021-09-22T10:08:15Z) - Cross-modal Consensus Network for Weakly Supervised Temporal Action
Localization [74.34699679568818]
Weakly supervised temporal action localization (WS-TAL) is a challenging task that aims to localize action instances in the given video with video-level categorical supervision.
We propose a cross-modal consensus network (CO2-Net) to tackle this problem.
arXiv Detail & Related papers (2021-07-27T04:21:01Z) - Region-based Non-local Operation for Video Classification [11.746833714322154]
This paper presents region-based non-local (RNL) operations as a family of self-attention mechanisms.
By combining a channel attention module with the proposed RNL, we design an attention chain, which can be integrated into the off-the-shelf CNNs for end-to-end training.
In experiments, our method outperforms other attention mechanisms and achieves state-of-the-art performance on the Something-Something V1 dataset.
arXiv Detail & Related papers (2020-07-17T14:57:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.