Hierarchical RNNs-Based Transformers MADDPG for Mixed
Cooperative-Competitive Environments
- URL: http://arxiv.org/abs/2105.04888v1
- Date: Tue, 11 May 2021 09:22:52 GMT
- Title: Hierarchical RNNs-Based Transformers MADDPG for Mixed
Cooperative-Competitive Environments
- Authors: Xiaolong Wei, LiFang Yang, Xianglin Huang, Gang Cao, Tao Zhulin,
Zhengyang Du, Jing An
- Abstract summary: This paper proposes a hierarchical transformer-based MADDPG built on RNNs, called Hierarchical RNNs-Based Transformers MADDPG (HRTMADDPG).
It consists of a lower-level RNN encoder that encodes the multiple time steps within each sub-sequence, and an upper, sequence-level transformer encoder that learns the correlations between sub-sequences.
- Score: 1.9241821314180374
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The attention mechanism is now widely applied in deep learning models.
Models built on attention can not only record the positional relationships
between features, but also measure the importance of different features through
their weights. By learning dynamically weighted parameters that select relevant
features and suppress irrelevant ones, key information is strengthened and
irrelevant information is weakened, which can significantly improve the
efficiency of deep learning algorithms. Although transformers have performed
very well in many fields, including reinforcement learning, many problems and
applications in this area can still be addressed with them. MARL (Multi-Agent
Reinforcement Learning) can be viewed as a set of independent agents that adapt
and learn, each in its own way, to reach a goal. To emphasize the relationships
between the MDP decisions made within a given time period, we apply a
hierarchical coding method and validate its effectiveness. This paper proposes
a hierarchical transformer-based MADDPG built on RNNs, which we call
Hierarchical RNNs-Based Transformers MADDPG (HRTMADDPG). It consists of a
lower-level RNN encoder that encodes the multiple time steps within each
sub-sequence, and an upper, sequence-level transformer encoder that learns the
correlations between sub-sequences, so that HRTMADDPG can capture the causal
relationships between sub-time sequences and become more efficient.
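To make the two-level encoder described above concrete, here is a minimal sketch in PyTorch. It is not the authors' implementation: the class name, the choice of a GRU for the lower level, and all hyperparameters are illustrative assumptions.

```python
# Minimal sketch of the hierarchical encoder described in the abstract:
# a lower-level RNN encodes the steps inside each time sub-sequence, and an
# upper-level transformer encoder relates the resulting sub-sequence
# summaries. All names and hyperparameters are illustrative, not the
# authors' code.
import torch
import torch.nn as nn


class HierarchicalRNNTransformerEncoder(nn.Module):
    def __init__(self, obs_dim: int, hidden_dim: int = 64, n_heads: int = 4,
                 n_layers: int = 2):
        super().__init__()
        # Lower level: GRU over the steps of one sub-sequence.
        self.step_encoder = nn.GRU(obs_dim, hidden_dim, batch_first=True)
        # Upper level: transformer encoder over sub-sequence summaries.
        layer = nn.TransformerEncoderLayer(d_model=hidden_dim, nhead=n_heads,
                                           batch_first=True)
        self.seq_encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        # obs: (batch, n_subseq, steps_per_subseq, obs_dim)
        b, s, t, d = obs.shape
        # Encode each sub-sequence independently with the RNN and keep its
        # final hidden state as the sub-sequence summary.
        _, h_n = self.step_encoder(obs.reshape(b * s, t, d))
        summaries = h_n[-1].reshape(b, s, -1)    # (batch, n_subseq, hidden)
        # Let the transformer model correlations between sub-sequences.
        return self.seq_encoder(summaries)       # (batch, n_subseq, hidden)


# Usage sketch: in a MADDPG-style actor or critic, this encoding would feed
# a small MLP head that outputs actions or Q-values for each agent.
if __name__ == "__main__":
    enc = HierarchicalRNNTransformerEncoder(obs_dim=10)
    x = torch.randn(8, 5, 4, 10)   # 8 samples, 5 sub-sequences of 4 steps
    print(enc(x).shape)            # torch.Size([8, 5, 64])
```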
Related papers
- Skip-Layer Attention: Bridging Abstract and Detailed Dependencies in Transformers [56.264673865476986]
This paper introduces Skip-Layer Attention (SLA) to enhance Transformer models.
SLA improves the model's ability to capture dependencies between high-level abstract features and low-level details.
Our implementation extends the Transformer's functionality by enabling queries in a given layer to interact with keys and values from both the current layer and one preceding layer (a rough sketch of this mechanism appears after this list).
arXiv Detail & Related papers (2024-06-17T07:24:38Z)
- An Effective-Efficient Approach for Dense Multi-Label Action Detection [23.100602876056165]
It is necessary to simultaneously learn (i) temporal dependencies and (ii) co-occurrence action relationships.
Recent approaches model temporal information by extracting multi-scale features through hierarchical transformer-based networks.
We argue that combining this with multiple sub-sampling processes in hierarchical designs can lead to further loss of positional information.
arXiv Detail & Related papers (2024-06-10T11:33:34Z)
- CFPFormer: Feature-pyramid like Transformer Decoder for Segmentation and Detection [1.837431956557716]
Feature pyramids have been widely adopted in convolutional neural networks (CNNs) and transformers for tasks like medical image segmentation and object detection.
We propose a novel decoder block that integrates feature pyramids and transformers.
Our model achieves superior performance in detecting small objects compared to existing methods.
arXiv Detail & Related papers (2024-04-23T18:46:07Z)
- Correlated Attention in Transformers for Multivariate Time Series [22.542109523780333]
We propose a novel correlated attention mechanism, which efficiently captures feature-wise dependencies, and can be seamlessly integrated within the encoder blocks of existing Transformers.
In particular, correlated attention operates across feature channels to compute cross-covariance matrices between queries and keys with different lag values, and selectively aggregate representations at the sub-series level.
This architecture facilitates automated discovery and representation learning of not only instantaneous but also lagged cross-correlations, while inherently capturing time series auto-correlation (see the sketch after this list).
arXiv Detail & Related papers (2023-11-20T17:35:44Z)
- PAT: Position-Aware Transformer for Dense Multi-Label Action Detection [36.39340228621982]
We present PAT, a transformer-based network that learns complex temporal co-occurrence action dependencies in a video.
We embed relative positional encoding in the self-attention mechanism and exploit multi-scale temporal relationships.
We evaluate the performance of our proposed approach on two challenging dense multi-label benchmark datasets.
arXiv Detail & Related papers (2023-08-09T16:29:31Z)
- FormerTime: Hierarchical Multi-Scale Representations for Multivariate Time Series Classification [53.55504611255664]
FormerTime is a hierarchical representation model for improving the classification capacity for the multivariate time series classification task.
It exhibits three aspects of merits: (1) learning hierarchical multi-scale representations from time series data, (2) inheriting the strength of both transformers and convolutional networks, and (3) tackling the efficiency challenges incurred by the self-attention mechanism.
arXiv Detail & Related papers (2023-02-20T07:46:14Z)
- Rich CNN-Transformer Feature Aggregation Networks for Super-Resolution [50.10987776141901]
Recent vision transformers along with self-attention have achieved promising results on various computer vision tasks.
We introduce an effective hybrid architecture for super-resolution (SR) tasks, which leverages local features from CNNs and long-range dependencies captured by transformers.
Our proposed method achieves state-of-the-art SR results on numerous benchmark datasets.
arXiv Detail & Related papers (2022-03-15T06:52:25Z)
- Hierarchical Multimodal Transformer to Summarize Videos [103.47766795086206]
Motivated by the great success of transformer and the natural structure of video (frame-shot-video), a hierarchical transformer is developed for video summarization.
To integrate the two kinds of information, they are encoded in a two-stream scheme, and a multimodal fusion mechanism is developed based on the hierarchical transformer.
Practically, extensive experiments show that HMT surpasses most of the traditional, RNN-based and attention-based video summarization methods.
arXiv Detail & Related papers (2021-09-22T07:38:59Z)
- Regularizing Transformers With Deep Probabilistic Layers [62.997667081978825]
In this work, we demonstrate how the inclusion of deep generative models within BERT can bring more versatile models.
We prove its effectiveness not only in Transformers but also in the most relevant encoder-decoder based LM, seq2seq with and without attention.
arXiv Detail & Related papers (2021-08-23T10:17:02Z)
- Less is More: Pay Less Attention in Vision Transformers [61.05787583247392]
Less attention vIsion Transformer builds upon the fact that convolutions, fully-connected layers, and self-attentions have almost equivalent mathematical expressions for processing image patch sequences.
The proposed LIT achieves promising performance on image recognition tasks, including image classification, object detection and instance segmentation.
arXiv Detail & Related papers (2021-05-29T05:26:07Z)
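The Skip-Layer Attention entry above describes letting queries in one layer attend to keys and values from both the current layer and one preceding layer. The block below is a hedged reading of that idea in PyTorch; concatenating the two sources along the sequence dimension, the module name, and the residual/normalization choices are assumptions, not the SLA paper's actual implementation.

```python
# Hedged sketch of a skip-layer attention block: queries from the current
# layer attend over keys/values built from both the current layer's input
# and the output of one preceding layer. Concatenating the two along the
# sequence dimension is an assumption for illustration.
import torch
import torch.nn as nn


class SkipLayerAttentionBlock(nn.Module):
    def __init__(self, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor, prev_layer: torch.Tensor) -> torch.Tensor:
        # x:          (batch, seq, d_model)  current layer input
        # prev_layer: (batch, seq, d_model)  output of an earlier layer
        kv = torch.cat([x, prev_layer], dim=1)   # keys/values from both layers
        out, _ = self.attn(query=x, key=kv, value=kv)
        return self.norm(x + out)                # residual + layer norm
```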
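Similarly, the Correlated Attention entry describes computing cross-covariance matrices between queries and keys at several lag values across feature channels, then aggregating the results. The sketch below shows one plausible form of that mechanism; the lag set, the uniform averaging over lags, and all names are illustrative assumptions rather than the paper's exact design.

```python
# Hedged sketch of lag-based, feature-wise (cross-covariance) attention.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CorrelatedAttention(nn.Module):
    def __init__(self, d_model: int = 32, lags=(0, 1, 2)):
        super().__init__()
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)
        self.lags = lags

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, d_model)
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        outputs = []
        for lag in self.lags:
            k_l = torch.roll(k, shifts=lag, dims=1)   # lagged keys
            v_l = torch.roll(v, shifts=lag, dims=1)   # lagged values
            # Cross-covariance across feature channels: (batch, d, d)
            cov = torch.einsum('btd,bte->bde', q, k_l) / q.shape[1]
            attn = F.softmax(cov, dim=-1)
            # Mix value channels with the channel-wise attention weights.
            outputs.append(torch.einsum('bde,bte->btd', attn, v_l))
        return torch.stack(outputs).mean(dim=0)        # aggregate over lags
```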