Linearized Relative Positional Encoding
- URL: http://arxiv.org/abs/2307.09270v1
- Date: Tue, 18 Jul 2023 13:56:43 GMT
- Title: Linearized Relative Positional Encoding
- Authors: Zhen Qin, Weixuan Sun, Kaiyue Lu, Hui Deng, Dongxu Li, Xiaodong Han,
Yuchao Dai, Lingpeng Kong, Yiran Zhong
- Abstract summary: Relative positional encoding is widely used in vanilla and linear transformers to represent positional information.
We put together a variety of existing linear relative positional encoding approaches under a canonical form.
We further propose a family of linear relative positional encoding algorithms via unitary transformation.
- Score: 43.898057545832366
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Relative positional encoding is widely used in vanilla and linear
transformers to represent positional information. However, existing encoding
methods of a vanilla transformer are not always directly applicable to a linear
transformer, because the latter requires a decomposition of the query and key
representations into separate kernel functions. Nevertheless, principles for
designing encoding methods suitable for linear transformers remain
understudied. In this work, we put together a variety of existing linear
relative positional encoding approaches under a canonical form and further
propose a family of linear relative positional encoding algorithms via unitary
transformation. Our formulation leads to a principled framework that can be
used to develop new relative positional encoding methods that preserve linear
space-time complexity. Equipped with different models, the proposed linearized
relative positional encoding (LRPE) family derives effective encoding for
various applications. Experiments show that compared with existing methods,
LRPE achieves state-of-the-art performance in language modeling, text
classification, and image classification. Meanwhile, it highlights a general
paradigm for designing a broader range of relative positional encoding methods
that are applicable to linear transformers. The code is available at
https://github.com/OpenNLPLab/Lrpe.
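The constraint the abstract describes can be sketched concretely: choose position-dependent unitary maps W_t such that the inner product of W_s q and W_t k depends only on the offset t - s, so queries and keys stay factored apart as linear attention requires. The rotary-style block rotation below is one member of such a unitary family. This is an illustrative NumPy sketch under our own naming, not the authors' LRPE implementation; softmax-free normalization is omitted for brevity.

```python
import numpy as np

def unitary_rpe(x, base=10000.0):
    """Rotary-style unitary positional map (one member of a unitary RPE family).

    Position t rotates each feature pair (i, i + d/2) by angle t * theta_i.
    Rotations are unitary, so <W_s q, W_t k> = <q, W_{t-s} k>: the similarity
    depends only on the relative offset t - s, while q and k remain separate
    factors, which is what linear attention needs.
    """
    n, d = x.shape
    assert d % 2 == 0, "feature dim must be even to form rotation pairs"
    half = d // 2
    theta = base ** (-np.arange(half) / half)   # per-pair rotation frequencies
    ang = np.outer(np.arange(n), theta)         # (n, half): angle t * theta_i
    cos, sin = np.cos(ang), np.sin(ang)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=1)

def linear_attention(q, k, v):
    """Unnormalized linear attention: q (k^T v) costs O(n d^2), not O(n^2 d)."""
    return q @ (k.T @ v)                        # (d, d_v) summary is linear in n

# Relative property: with identical content at every position, the rotated
# similarity between positions s and t depends only on t - s.
rng = np.random.default_rng(0)
x = np.tile(rng.standard_normal(8), (10, 1))    # same vector at 10 positions
r = unitary_rpe(x)
print(np.allclose(r[2] @ r[5], r[0] @ r[3]))    # both offsets equal 3 -> True
```

Because the rotation is applied independently to queries and keys, the encoded similarity never requires materializing the full n-by-n attention matrix, preserving linear space-time complexity.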
Related papers
- Improving Transformers using Faithful Positional Encoding [55.30212768657544]
We propose a new positional encoding method for a neural network architecture called the Transformer.
Unlike the standard sinusoidal positional encoding, our approach has a guarantee of not losing information about the positional order of the input sequence.
arXiv Detail & Related papers (2024-05-15T03:17:30Z) - PoPE: Legendre Orthogonal Polynomials Based Position Encoding for Large Language Models [0.0]
Orthogonal Polynomial Based Positional Encoding (PoPE) encodes positional information using orthogonal Legendre polynomials.
We show that transformer models with PoPE outperform baseline transformer models on the $Multi30k$ English-to-German translation task.
We will present novel theoretical perspectives on position encoding based on the superior performance of PoPE.
arXiv Detail & Related papers (2024-04-29T10:30:59Z) - Improving Position Encoding of Transformers for Multivariate Time Series
Classification [5.467400475482668]
We propose a new absolute position encoding method dedicated to time series data, called time Absolute Position Encoding (tAPE).
We then propose a novel multivariate time series classification (MTSC) model named ConvTran, which combines tAPE/eRPE with convolution-based input encoding to improve the position and data embedding of time series data.
arXiv Detail & Related papers (2023-05-26T05:30:04Z) - Learning a Fourier Transform for Linear Relative Positional Encodings in Transformers [71.32827362323205]
We propose a new class of linear Transformers called Learner-Transformers (Learners).
They incorporate a wide range of relative positional encoding mechanisms (RPEs).
These include regular RPE techniques applied for sequential data, as well as novel RPEs operating on geometric data embedded in higher-dimensional Euclidean spaces.
arXiv Detail & Related papers (2023-02-03T18:57:17Z) - Error Correction Code Transformer [92.10654749898927]
We propose the first extension of the Transformer architecture to the soft decoding of linear codes at arbitrary block lengths.
We encode each channel output into a high-dimensional representation so that the bit information can be better represented and processed separately.
The proposed approach demonstrates the extreme power and flexibility of Transformers and outperforms existing state-of-the-art neural decoders by large margins at a fraction of their time complexity.
arXiv Detail & Related papers (2022-03-27T15:25:58Z) - Learnable Fourier Features for Multi-Dimensional Spatial Positional
Encoding [96.9752763607738]
We propose a novel positional encoding method based on learnable Fourier features.
Our experiments show that our learnable feature representation for multi-dimensional positional encoding outperforms existing methods.
arXiv Detail & Related papers (2021-06-05T04:40:18Z) - Relative Positional Encoding for Transformers with Linear Complexity [30.48367640796256]
Relative positional encoding (RPE) was proposed as beneficial for classical Transformers.
RPE is not available for the recent linear-variants of the Transformer, because it requires the explicit computation of the attention matrix.
In this paper, we present a way to generate PE that can be used as a replacement for the classical additive (sinusoidal) PE and provably behaves like RPE.
arXiv Detail & Related papers (2021-05-18T09:52:32Z) - Demystifying the Better Performance of Position Encoding Variants for
Transformer [12.503079503907989]
We show how to encode position and segment into Transformer models.
The proposed method performs on par with SOTA on GLUE, XTREME and WMT benchmarks while saving costs.
arXiv Detail & Related papers (2021-04-18T03:44:57Z) - Learning to Encode Position for Transformer with Continuous Dynamical
Model [88.69870971415591]
We introduce a new way of learning to encode position information for non-recurrent models, such as Transformer models.
We model the evolution of encoded results along position index by such a dynamical system.
arXiv Detail & Related papers (2020-03-13T00:41:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.