Hierarchical RNNs-Based Transformers MADDPG for Mixed
Cooperative-Competitive Environments
- URL: http://arxiv.org/abs/2105.04888v1
- Date: Tue, 11 May 2021 09:22:52 GMT
- Title: Hierarchical RNNs-Based Transformers MADDPG for Mixed
Cooperative-Competitive Environments
- Authors: Xiaolong Wei, LiFang Yang, Xianglin Huang, Gang Cao, Tao Zhulin,
Zhengyang Du, Jing An
- Abstract summary: This paper proposes a hierarchical transformer-based MADDPG built on RNNs, called Hierarchical RNNs-Based Transformers MADDPG (HRTMADDPG).
It consists of a lower-level RNN encoder that encodes the multiple time steps within each sub-sequence, and an upper, sequence-level transformer encoder that learns the correlations between sub-sequences.
- Score: 1.9241821314180374
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The attention mechanism is now widely applied in deep learning models.
Models built on attention can not only record the positional relationships
between features, but also measure the importance of different features through
their weights. By learning dynamically weighted parameters that select relevant
features and suppress irrelevant ones, key information is strengthened and
irrelevant information is weakened, which can significantly improve the
efficiency of deep learning algorithms. Although transformers have performed
very well in many fields, including reinforcement learning, many problems and
applications in this area can still be addressed with them. MARL (Multi-Agent
Reinforcement Learning) can be viewed as a set of independent agents that adapt
and learn, each in its own way, to reach a goal. To emphasize the relationships
between the MDP decisions made within a given time period, we apply a
hierarchical coding method and validate its effectiveness. This paper proposes
a hierarchical transformer-based MADDPG built on RNNs, which we call
Hierarchical RNNs-Based Transformers MADDPG (HRTMADDPG). It consists of a
lower-level RNN encoder that encodes the multiple time steps within each
sub-sequence, and an upper, sequence-level transformer encoder that learns the
correlations between sub-sequences, so that HRTMADDPG can capture the causal
relationships between sub-time sequences and become more efficient.
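To make the two-level encoder described above concrete, here is a minimal sketch in PyTorch. It is not the authors' implementation: the class name, the choice of a GRU for the lower level, and all hyperparameters are illustrative assumptions.

```python
# Minimal sketch of the hierarchical encoder described in the abstract:
# a lower-level RNN encodes the steps inside each time sub-sequence, and an
# upper-level transformer encoder relates the resulting sub-sequence
# summaries. All names and hyperparameters are illustrative, not the
# authors' code.
import torch
import torch.nn as nn


class HierarchicalRNNTransformerEncoder(nn.Module):
    def __init__(self, obs_dim: int, hidden_dim: int = 64, n_heads: int = 4,
                 n_layers: int = 2):
        super().__init__()
        # Lower level: GRU over the steps of one sub-sequence.
        self.step_encoder = nn.GRU(obs_dim, hidden_dim, batch_first=True)
        # Upper level: transformer encoder over sub-sequence summaries.
        layer = nn.TransformerEncoderLayer(d_model=hidden_dim, nhead=n_heads,
                                           batch_first=True)
        self.seq_encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # obs: (batch, n_subseq, steps_per_subseq, obs_dim)
        b, s, t, d = obs.shape
        # Encode each sub-sequence independently with the RNN and keep its
        # final hidden state as the sub-sequence summary.
        _, h_n = self.step_encoder(obs.reshape(b * s, t, d))
        summaries = h_n[-1].reshape(b, s, -1)    # (batch, n_subseq, hidden)
        # Let the transformer model correlations between sub-sequences.
        return self.seq_encoder(summaries)       # (batch, n_subseq, hidden)


# Usage sketch: in a MADDPG-style actor or critic, this encoding would feed
# a small MLP head that outputs actions or Q-values for each agent.
if __name__ == "__main__":
    enc = HierarchicalRNNTransformerEncoder(obs_dim=10)
    x = torch.randn(8, 5, 4, 10)   # 8 samples, 5 sub-sequences of 4 steps
    print(enc(x).shape)            # torch.Size([8, 5, 64])
```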
Related papers
- Skip-Layer Attention: Bridging Abstract and Detailed Dependencies in Transformers [56.264673865476986]
This paper introduces Skip-Layer Attention (SLA) to enhance Transformer models.
SLA improves the model's ability to capture dependencies between high-level abstract features and low-level details.
Our implementation extends the Transformer's functionality by enabling queries in a given layer to interact with keys and values from both the current layer and one preceding layer (a rough sketch of this mechanism appears after this list).
arXiv Detail & Related papers (2024-06-17T07:24:38Z)
- An Effective-Efficient Approach for Dense Multi-Label Action Detection [23.100602876056165]
It is necessary to simultaneously learn (i) temporal dependencies and (ii) co-occurrence action relationships.
Recent approaches model temporal information by extracting multi-scale features through hierarchical transformer-based networks.
We argue that combining this with multiple sub-sampling processes in hierarchical designs can lead to further loss of positional information.
arXiv Detail & Related papers (2024-06-10T11:33:34Z)
- CFPFormer: Feature-pyramid like Transformer Decoder for Segmentation and Detection [1.837431956557716]
Feature pyramids have been widely adopted in convolutional neural networks (CNNs) and transformers for tasks like medical image segmentation and object detection.
We propose a novel decoder block that integrates feature pyramids and transformers.
Our model achieves superior performance in detecting small objects compared to existing methods.
arXiv Detail & Related papers (2024-04-23T18:46:07Z)
- Correlated Attention in Transformers for Multivariate Time Series [22.542109523780333]
We propose a novel correlated attention mechanism, which efficiently captures feature-wise dependencies, and can be seamlessly integrated within the encoder blocks of existing Transformers.
In particular, correlated attention operates across feature channels to compute cross-covariance matrices between queries and keys with different lag values, and selectively aggregate representations at the sub-series level.
This architecture facilitates automated discovery and representation learning of not only instantaneous but also lagged cross-correlations, while inherently capturing time series auto-correlation (see the sketch after this list).
arXiv Detail & Related papers (2023-11-20T17:35:44Z)
- PAT: Position-Aware Transformer for Dense Multi-Label Action Detection [36.39340228621982]
We present PAT, a transformer-based network that learns complex temporal co-occurrence action dependencies in a video.
We embed relative positional encoding in the self-attention mechanism and exploit multi-scale temporal relationships.
We evaluate the performance of our proposed approach on two challenging dense multi-label benchmark datasets.
arXiv Detail & Related papers (2023-08-09T16:29:31Z)
- FormerTime: Hierarchical Multi-Scale Representations for Multivariate Time Series Classification [53.55504611255664]
FormerTime is a hierarchical representation model for improving the classification capacity for the multivariate time series classification task.
It exhibits three aspects of merits: (1) learning hierarchical multi-scale representations from time series data, (2) inheriting the strength of both transformers and convolutional networks, and (3) tackling the efficiency challenges incurred by the self-attention mechanism.
arXiv Detail & Related papers (2023-02-20T07:46:14Z)
- Rich CNN-Transformer Feature Aggregation Networks for Super-Resolution [50.10987776141901]
Recent vision transformers along with self-attention have achieved promising results on various computer vision tasks.
We introduce an effective hybrid architecture for super-resolution (SR) tasks, which leverages local features from CNNs and long-range dependencies captured by transformers.
Our proposed method achieves state-of-the-art SR results on numerous benchmark datasets.
arXiv Detail & Related papers (2022-03-15T06:52:25Z)
- Hierarchical Multimodal Transformer to Summarize Videos [103.47766795086206]
Motivated by the great success of transformer and the natural structure of video (frame-shot-video), a hierarchical transformer is developed for video summarization.
To integrate the two kinds of information, they are encoded in a two-stream scheme, and a multimodal fusion mechanism is developed based on the hierarchical transformer.
Practically, extensive experiments show that HMT surpasses most of the traditional, RNN-based and attention-based video summarization methods.
arXiv Detail & Related papers (2021-09-22T07:38:59Z)
- Regularizing Transformers With Deep Probabilistic Layers [62.997667081978825]
In this work, we demonstrate how the inclusion of deep generative models within BERT can bring more versatile models.
We prove its effectiveness not only in Transformers but also in the most relevant encoder-decoder based LM, seq2seq with and without attention.
arXiv Detail & Related papers (2021-08-23T10:17:02Z)
- Less is More: Pay Less Attention in Vision Transformers [61.05787583247392]
Less attention vIsion Transformer builds upon the fact that convolutions, fully-connected layers, and self-attentions have almost equivalent mathematical expressions for processing image patch sequences.
The proposed LIT achieves promising performance on image recognition tasks, including image classification, object detection and instance segmentation.
arXiv Detail & Related papers (2021-05-29T05:26:07Z)
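The Skip-Layer Attention entry above describes letting queries in one layer attend to keys and values from both the current layer and one preceding layer. The block below is a hedged reading of that idea in PyTorch; concatenating the two sources along the sequence dimension, the module name, and the residual/normalization choices are assumptions, not the SLA paper's actual implementation.

```python
# Hedged sketch of a skip-layer attention block: queries from the current
# layer attend over keys/values built from both the current layer's input
# and the output of one preceding layer. Concatenating the two along the
# sequence dimension is an assumption for illustration.
import torch
import torch.nn as nn


class SkipLayerAttentionBlock(nn.Module):
    def __init__(self, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor, prev_layer: torch.Tensor) -> torch.Tensor:
        # x:          (batch, seq, d_model)  current layer input
        # prev_layer: (batch, seq, d_model)  output of an earlier layer
        kv = torch.cat([x, prev_layer], dim=1)   # keys/values from both layers
        out, _ = self.attn(query=x, key=kv, value=kv)
        return self.norm(x + out)                # residual + layer norm
```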
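Similarly, the Correlated Attention entry describes computing cross-covariance matrices between queries and keys at several lag values across feature channels, then aggregating the results. The sketch below shows one plausible form of that mechanism; the lag set, the uniform averaging over lags, and all names are illustrative assumptions rather than the paper's exact design.

```python
# Hedged sketch of lag-based, feature-wise (cross-covariance) attention.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CorrelatedAttention(nn.Module):
    def __init__(self, d_model: int = 32, lags=(0, 1, 2)):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.lags = lags

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, d_model)
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        outputs = []
        for lag in self.lags:
            k_l = torch.roll(k, shifts=lag, dims=1)   # lagged keys
            v_l = torch.roll(v, shifts=lag, dims=1)   # lagged values
            # Cross-covariance across feature channels: (batch, d, d)
            cov = torch.einsum('btd,bte->bde', q, k_l) / q.shape[1]
            attn = F.softmax(cov, dim=-1)
            # Mix value channels with the channel-wise attention weights.
            outputs.append(torch.einsum('bde,bte->btd', attn, v_l))
        return torch.stack(outputs).mean(dim=0)        # aggregate over lags
```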