CoPE: A Lightweight Complex Positional Encoding
- URL: http://arxiv.org/abs/2508.18308v1
- Date: Sat, 23 Aug 2025 08:02:07 GMT
- Title: CoPE: A Lightweight Complex Positional Encoding
- Authors: Avinash Amballa
- Abstract summary: We introduce CoPE, a novel architecture that leverages complex-valued encoding to encode both content and positional information. Our approach replaces traditional positional encodings with complex embeddings where the real part captures semantic content and the imaginary part encodes positional information. We show that CoPE does not exhibit long-term decay and is compatible with linear attention.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Recent studies have demonstrated the effectiveness of position encoding in transformer architectures. By incorporating positional information, this approach provides essential guidance for modeling dependencies between elements across different sequence positions. We introduce CoPE (a lightweight Complex Positional Encoding), a novel architecture that leverages complex-valued encoding to encode both content and positional information. Our approach replaces traditional positional encodings with complex embeddings where the real part captures semantic content and the imaginary part encodes positional information. We introduce phase-aware attention in the first layer of the transformer model to capture position-dependent patterns, followed by standard attention layers for the higher levels. We show that CoPE does not exhibit long-term decay and is compatible with linear attention. Experimental evaluation on the GLUE benchmark suggests that our approach achieves superior performance with less computational complexity compared to RoPE, sinusoidal, and learned positional encodings.
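The core idea from the abstract can be sketched in a few lines: embed each token as a complex vector whose real part is the content embedding and whose imaginary part is a positional phase, then score attention with the real part of the complex inner product so that both content and relative position shape the weights. This is a minimal illustrative sketch, not the paper's implementation; the sinusoidal phase scheme and the softmax-over-real-part scoring below are assumptions made for illustration.

```python
import numpy as np

def cope_embed(token_emb, d_model):
    # Complex embedding: real part = semantic content, imaginary part =
    # positional phase. The sinusoidal phase scheme is an assumption;
    # the paper may parameterize positions differently.
    n = token_emb.shape[0]
    freqs = 1.0 / (10000.0 ** (np.arange(d_model) / d_model))
    phase = np.sin(np.arange(n)[:, None] * freqs[None, :])
    return token_emb + 1j * phase

def phase_aware_attention(z, v):
    # Scores come from the real part of the complex inner product, so both
    # content similarity and relative phase (position) influence the weights.
    scores = np.real(z @ z.conj().T) / np.sqrt(z.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)   # row-wise softmax
    return w @ v

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 8))          # 5 tokens, model dim 8
z = cope_embed(x, 8)
out = phase_aware_attention(z, np.real(z))
```

In this toy form the values are the real (content) part only; a faithful version would use separate learned query/key/value projections per the paper.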
Related papers
- SeqPE: Transformer with Sequential Position Encoding [76.22159277300891]
SeqPE represents each $n$-dimensional position index as a symbolic sequence and employs a lightweight sequential position encoder to learn their embeddings. Experiments across language modeling, long-context question answering, and 2D image classification demonstrate that SeqPE not only surpasses strong baselines in perplexity, exact match (EM), and accuracy, but also enables seamless generalization to multi-dimensional inputs without requiring manual architectural redesign.
arXiv Detail & Related papers (2025-06-16T09:16:40Z) - Learnable Spatial-Temporal Positional Encoding for Link Prediction [44.0907827498725]
We propose a simple temporal link prediction model named L-STEP. L-STEP can preserve the graph property from the spatial-temporal spectral viewpoint. L-STEP obtains the leading performance in the newest large-scale TGB benchmark.
arXiv Detail & Related papers (2025-06-10T00:35:53Z) - PaTH Attention: Position Encoding via Accumulating Householder Transformations [56.32365080761523]
PaTH is a flexible, data-dependent position encoding scheme based on accumulated products of Householder transformations. We derive an efficient parallel algorithm for training by exploiting a compact representation of products of Householder matrices.
arXiv Detail & Related papers (2025-05-22T08:36:09Z) - Improving Transformers using Faithful Positional Encoding [55.30212768657544]
We propose a new positional encoding method for a neural network architecture called the Transformer.
Unlike the standard sinusoidal positional encoding, our approach has a guarantee of not losing information about the positional order of the input sequence.
arXiv Detail & Related papers (2024-05-15T03:17:30Z) - PoPE: Legendre Orthogonal Polynomials Based Position Encoding for Large Language Models [0.0]
Orthogonal Polynomial Based Positional Encoding (PoPE) encodes positional information using orthogonal Legendre polynomials.
We show that transformer models with PoPE outperform baseline transformer models on the $Multi30k$ English-to-German translation task.
We also present novel theoretical perspectives on position encoding based on the superior performance of PoPE.
arXiv Detail & Related papers (2024-04-29T10:30:59Z) - Two Stones Hit One Bird: Bilevel Positional Encoding for Better Length Extrapolation [69.68831888599476]
We develop a new positional encoding method called Bilevel Positional Encoding (BiPE).
Theoretical analysis shows this disentanglement of positional information makes learning more effective.
Our BiPE has superior length extrapolation capabilities across a wide range of tasks in diverse text modalities.
arXiv Detail & Related papers (2024-01-29T18:59:07Z) - Trading Positional Complexity vs. Deepness in Coordinate Networks [33.90893096003318]
We show that alternative non-Fourier embedding functions can indeed be used for positional encoding.
Their performance is entirely determined by a trade-off between the stable rank of the embedded matrix and the distance preservation between embedded coordinates.
We argue that employing a more complex positional encoding -- that scales exponentially with the number of modes -- requires only a linear (rather than deep) coordinate function to achieve comparable performance.
arXiv Detail & Related papers (2022-05-18T15:17:09Z) - Rethinking and Improving Relative Position Encoding for Vision Transformer [61.559777439200744]
Relative position encoding (RPE) is important for transformer to capture sequence ordering of input tokens.
We propose new relative position encoding methods dedicated to 2D images, called image RPE (iRPE)
arXiv Detail & Related papers (2021-07-29T17:55:10Z) - Learnable Fourier Features for Multi-Dimensional Spatial Positional Encoding [96.9752763607738]
We propose a novel positional encoding method based on learnable Fourier features.
Our experiments show that our learnable feature representation for multi-dimensional positional encoding outperforms existing methods.
arXiv Detail & Related papers (2021-06-05T04:40:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.