Trading Positional Complexity vs. Deepness in Coordinate Networks
- URL: http://arxiv.org/abs/2205.08987v1
- Date: Wed, 18 May 2022 15:17:09 GMT
- Title: Trading Positional Complexity vs. Deepness in Coordinate Networks
- Authors: Jianqiao Zheng, Sameera Ramasinghe, Xueqian Li, Simon Lucey
- Abstract summary: We show that alternative non-Fourier embedding functions can indeed be used for positional encoding.
Their performance is entirely determined by a trade-off between the stable rank of the embedded matrix and the distance preservation between embedded coordinates.
We argue that employing a more complex positional encoding -- that scales exponentially with the number of modes -- requires only a linear (rather than deep) coordinate function to achieve comparable performance.
- Score: 33.90893096003318
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is well noted that coordinate-based MLPs benefit, in terms of preserving high-frequency information, from encoding coordinate positions as an array of Fourier features. Hitherto, the rationale for the effectiveness of these positional encodings has been studied mainly through a Fourier lens. In this paper, we strive to broaden this understanding by showing that alternative, non-Fourier embedding functions can indeed be used for positional encoding. Moreover, we show that their performance is entirely determined by a trade-off between the stable rank of the embedded matrix and the distance preservation between embedded coordinates. We further establish that the now-ubiquitous Fourier feature mapping of position is a special case that fulfills these conditions. Consequently, we present a more general theory for analyzing positional encoding in terms of shifted basis functions. In addition, we argue that employing a more complex positional encoding, one that scales exponentially with the number of modes, requires only a linear (rather than deep) coordinate function to achieve comparable performance. Counter-intuitively, we demonstrate that trading positional embedding complexity for network deepness is orders of magnitude faster than the current state-of-the-art, despite the additional embedding complexity. To support these claims, we develop the necessary theoretical formulae and verify empirically that they hold in practice.
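The quantities discussed in the abstract are straightforward to probe numerically. The following is a minimal sketch, not the authors' implementation: it builds a Fourier-feature embedding of a 1D coordinate grid, computes the stable rank of the embedded matrix, uses a simple rescaled-distance gap as a stand-in for the paper's distance-preservation measure, and fits a high-frequency signal with a purely linear readout over the embedding. The frequency grid, the distortion proxy, and the test signal are illustrative assumptions.

```python
import numpy as np

def fourier_features(x, freqs):
    """Embed scalar coordinates x (N,) as [cos(2*pi*f*x), sin(2*pi*f*x)] per frequency -> (N, 2F)."""
    angles = 2.0 * np.pi * np.outer(x, freqs)
    return np.concatenate([np.cos(angles), np.sin(angles)], axis=1)

def stable_rank(M):
    """Stable rank: squared Frobenius norm divided by squared spectral norm."""
    s = np.linalg.svd(M, compute_uv=False)
    return float((s ** 2).sum() / (s[0] ** 2))

def distance_distortion(x, X):
    """Mean gap between (rescaled) embedded distances and coordinate distances.
    A simple proxy for distance preservation, not the paper's exact measure."""
    d_coord = np.abs(x[:, None] - x[None, :])
    d_embed = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    scale = d_coord.sum() / max(d_embed.sum(), 1e-12)
    return float(np.abs(scale * d_embed - d_coord).mean())

# Dense 1D coordinate grid, as when fitting a coordinate network to a signal.
x = np.linspace(0.0, 1.0, 256)
freqs = np.arange(1, 65)                      # assumed integer frequency grid (64 modes)

X = fourier_features(x, freqs)                # embedded matrix, shape (256, 128)
print("stable rank:", stable_rank(X))
print("distance distortion:", distance_distortion(x, X))

# With a rich enough embedding, even a *linear* coordinate function (a least-squares
# readout over the features) can reproduce a high-frequency target.
target = np.sin(2 * np.pi * 13 * x) + 0.5 * np.sin(2 * np.pi * 40 * x)
w, *_ = np.linalg.lstsq(X, target, rcond=None)
print("linear-readout MSE:", float(np.mean((X @ w - target) ** 2)))
```

Densifying the frequency grid makes the embedding more complex; the abstract's argument is that, once the encoding is rich enough, the coordinate function on top of it can remain linear.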
Related papers
- Improving Transformers using Faithful Positional Encoding [55.30212768657544]
We propose a new positional encoding method for the Transformer architecture.
Unlike the standard sinusoidal positional encoding, our approach is guaranteed not to lose information about the positional order of the input sequence.
arXiv Detail & Related papers (2024-05-15T03:17:30Z)
- Positional Encoding Helps Recurrent Neural Networks Handle a Large Vocabulary [1.4594704809280983]
Positional encoding is a high-dimensional representation of the time indices of input data.
RNNs can encode the temporal information of data points on their own, which makes their use of positional encoding seem redundant.
arXiv Detail & Related papers (2024-01-31T23:32:20Z)
- Coordinate Quantized Neural Implicit Representations for Multi-view Reconstruction [28.910183274743872]
We introduce neural implicit representations with quantized coordinates, which reduces the uncertainty and ambiguity in the field during optimization.
We use discrete coordinates and their positional encodings to learn implicit functions through volume rendering.
Evaluations on widely used benchmarks show that our method outperforms the state-of-the-art.
arXiv Detail & Related papers (2023-08-21T20:27:33Z)
- NAF: Neural Attenuation Fields for Sparse-View CBCT Reconstruction [79.13750275141139]
This paper proposes a novel and fast self-supervised solution for sparse-view CBCT reconstruction.
The desired attenuation coefficients are represented as a continuous function of 3D spatial coordinates, parameterized by a fully-connected deep neural network.
A learning-based encoder with hash coding is adopted to help the network capture high-frequency details.
arXiv Detail & Related papers (2022-09-29T04:06:00Z)
- Rethinking Positional Encoding [31.80055086317266]
We show that alternative non-Fourier embedding functions can indeed be used for positional encoding.
We show that their performance is entirely determined by a trade-off between the stable rank of the embedded matrix and the distance preservation between embedded coordinates.
We present a more general theory to analyze positional encoding in terms of shifted basis functions.
arXiv Detail & Related papers (2021-07-06T12:04:04Z)
- Compressing Deep ODE-Nets using Basis Function Expansions [105.05435207079759]
We consider formulations of the weights as continuous-depth functions using linear combinations of basis functions.
This perspective allows us to compress the weights through a change of basis, without retraining, while maintaining near state-of-the-art performance.
In turn, both inference time and the memory footprint are reduced, enabling quick and rigorous adaptation between computational environments.
arXiv Detail & Related papers (2021-06-21T03:04:51Z)
- Learnable Fourier Features for Multi-Dimensional Spatial Positional Encoding [96.9752763607738]
We propose a novel positional encoding method based on learnable Fourier features.
Our experiments show that our learnable feature representation for multi-dimensional positional encoding outperforms existing methods. (A minimal illustrative sketch of such a mapping appears after this list.)
arXiv Detail & Related papers (2021-06-05T04:40:18Z)
- On Approximation in Deep Convolutional Networks: a Kernel Perspective [12.284934135116515]
We study the success of deep convolutional networks on tasks involving high-dimensional data such as images or audio.
We study this theoretically and empirically through the lens of kernel methods, by considering multi-layer convolutional kernels.
We find that while expressive kernels operating on input patches are important at the first layer, simpler kernels can suffice in higher layers for good performance.
arXiv Detail & Related papers (2021-02-19T17:03:42Z)
- MetaSDF: Meta-learning Signed Distance Functions [85.81290552559817]
Generalizing across shapes with neural implicit representations amounts to learning priors over the respective function space.
We formalize learning of a shape space as a meta-learning problem and leverage gradient-based meta-learning algorithms to solve this task.
arXiv Detail & Related papers (2020-06-17T05:14:53Z)
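As a companion to the "Learnable Fourier Features for Multi-Dimensional Spatial Positional Encoding" entry above, here is a minimal sketch of one common form of learnable Fourier features: a trainable projection matrix followed by sine and cosine. The module name, initialization scale, and the small regression head are illustrative assumptions, not that paper's exact architecture.

```python
import torch
import torch.nn as nn

class LearnableFourierFeatures(nn.Module):
    """Positional encoding gamma(x) = [cos(x W^T), sin(x W^T)] / sqrt(F) with a trainable W.

    x has shape (..., coord_dim); the output has shape (..., 2 * num_features).
    W is learned jointly with whatever network consumes the encoding.
    """
    def __init__(self, coord_dim: int, num_features: int, init_scale: float = 10.0):
        super().__init__()
        # Random Gaussian initialization; the scale sets the initial frequency band (an assumption).
        self.W = nn.Parameter(init_scale * torch.randn(num_features, coord_dim))
        self.num_features = num_features

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        proj = x @ self.W.t()                                           # (..., num_features)
        feats = torch.cat([torch.cos(proj), torch.sin(proj)], dim=-1)   # (..., 2 * num_features)
        return feats / self.num_features ** 0.5

# Usage: encode 2D coordinates (e.g. normalized pixel positions) and regress a value per point.
encoder = LearnableFourierFeatures(coord_dim=2, num_features=64)
head = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 1))
coords = torch.rand(1024, 2)
pred = head(encoder(coords))                                            # (1024, 1); train both end to end
```

Training W lets the encoding adapt its frequency content to the data, rather than fixing the frequencies up front as a standard Fourier-feature mapping does.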
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.