An Uncertainty Principle for Linear Recurrent Neural Networks
- URL: http://arxiv.org/abs/2502.09287v1
- Date: Thu, 13 Feb 2025 13:01:46 GMT
- Title: An Uncertainty Principle for Linear Recurrent Neural Networks
- Authors: Alexandre François, Antonio Orvieto, Francis Bach
- Abstract summary: We build a linear filter of order $S$ that approximates the filter that looks $K$ time steps in the past.
We fully characterize the problem by providing lower bounds of approximation, as well as explicit filters that achieve this lower bound up to constants.
The optimal performance highlights an uncertainty principle: the filter has to average values around the $K$-th time step in the past with a range (width) proportional to $K/S$.
- Score: 54.13281679205581
- License:
- Abstract: We consider linear recurrent neural networks, which have become a key building block of sequence modeling due to their ability for stable and effective long-range modeling. In this paper, we aim at characterizing this ability on a simple but core copy task, whose goal is to build a linear filter of order $S$ that approximates the filter that looks $K$ time steps in the past (which we refer to as the shift-$K$ filter), where $K$ is larger than $S$. Using classical signal models and quadratic cost, we fully characterize the problem by providing lower bounds of approximation, as well as explicit filters that achieve this lower bound up to constants. The optimal performance highlights an uncertainty principle: the optimal filter has to average values around the $K$-th time step in the past with a range (width) that is proportional to $K/S$.
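A small numerical illustration may make the copy task concrete. The sketch below is our own toy setup, not the paper's optimal construction: it fixes $S$ hand-picked real decay rates for a diagonal linear recurrence, fits the readout weights to the shift-$K$ impulse response by least squares (under white-noise inputs the quadratic cost reduces to the $\ell_2$ distance between impulse responses), and reports a rough effective width of the fitted filter around lag $K$, the quantity the paper shows must scale like $K/S$.

```python
# Toy sketch (assumed setup, not the paper's construction): approximate the
# shift-K impulse response h*[t] = 1{t = K} with an order-S linear recurrent
# filter whose decay rates are fixed and whose readout is fit by least squares.
import numpy as np

K, S, T = 64, 8, 512                       # target lag, filter order, truncation horizon
t = np.arange(T)

# Impulse responses of S scalar recurrences x_n = lambda * x_{n-1} + u_n
lambdas = 1.0 - np.linspace(0.5, 4.0, S) / K     # decays clustered near 1 (a modeling choice)
basis = np.stack([lam ** t for lam in lambdas], axis=1)   # shape (T, S)

target = np.zeros(T)
target[K] = 1.0                            # impulse response of the shift-K filter

w, *_ = np.linalg.lstsq(basis, target, rcond=None)   # best order-S readout
approx = basis @ w

# Rough effective width of the approximation around lag K
mass = approx ** 2 / np.sum(approx ** 2)
width = np.sqrt(np.sum(mass * (t - K) ** 2))
err = np.linalg.norm(approx - target)
print(f"l2 error = {err:.3f}, effective width = {width:.1f}, K/S = {K / S:.1f}")
```

Different choices of decay rates change the constants, but the trade-off between how sharply an order-$S$ filter can concentrate at lag $K$ and the order $S$ itself is exactly what the paper's lower bound quantifies.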
Related papers
- Autonomous Sparse Mean-CVaR Portfolio Optimization [6.358973724565783]
We propose an innovative autonomous sparse mean-CVaR portfolio model, capable of approximating the original model with arbitrary accuracy.
We then propose a proximal alternating linearized minimization algorithm, coupled with a nested fixed-point proximity algorithm (both convergent) to iteratively solve the model.
arXiv Detail & Related papers (2024-05-13T15:16:22Z)
- Closed-form Filtering for Non-linear Systems [83.91296397912218]
We propose a new class of filters based on Gaussian PSD Models, which offer several advantages in terms of density approximation and computational efficiency.
We show that filtering can be efficiently performed in closed form when transitions and observations are Gaussian PSD Models; the classical linear-Gaussian analogue is sketched after this entry.
Our proposed estimator enjoys strong theoretical guarantees, with estimation error that depends on the quality of the approximation and is adaptive to the regularity of the transition probabilities.
arXiv Detail & Related papers (2024-02-15T08:51:49Z)
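For reference on the closed-form claim above: in the classical linear-Gaussian special case, closed-form filtering is the textbook Kalman recursion. The sketch below shows only that baseline; it is not the paper's Gaussian PSD model construction, just the standard case it generalizes.

```python
# Hedged illustration: closed-form filtering in the linear-Gaussian case
# (the textbook Kalman filter), not the paper's Gaussian PSD models.
import numpy as np

def kalman_step(m, P, y, A, C, Q, R):
    """One predict/update cycle for x_t = A x_{t-1} + w, y_t = C x_t + v."""
    # Predict
    m_pred = A @ m
    P_pred = A @ P @ A.T + Q
    # Update (closed form thanks to Gaussian conjugacy)
    S = C @ P_pred @ C.T + R                     # innovation covariance
    K = P_pred @ C.T @ np.linalg.inv(S)          # Kalman gain
    m_new = m_pred + K @ (y - C @ m_pred)
    P_new = (np.eye(len(m)) - K @ C) @ P_pred
    return m_new, P_new

# Toy example: 2-D latent state, noisy direct observations
A, C = np.array([[1.0, 0.1], [0.0, 1.0]]), np.eye(2)
Q, R = 0.01 * np.eye(2), 0.25 * np.eye(2)
m, P = np.zeros(2), np.eye(2)
for y in np.random.randn(10, 2):
    m, P = kalman_step(m, P, y, A, C, Q, R)
print(m)
```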
- Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic Shortest Path [80.60592344361073]
We study the Stochastic Shortest Path (SSP) problem with a linear mixture transition kernel.
An agent repeatedly interacts with an environment and seeks to reach a certain goal state while minimizing the cumulative cost.
Existing works often assume a strictly positive lower bound on the cost function or an upper bound on the expected length of the optimal policy.
arXiv Detail & Related papers (2024-02-14T07:52:00Z)
- Implicit Maximum a Posteriori Filtering via Adaptive Optimization [4.767884267554628]
We frame the standard Bayesian filtering problem as optimization over a time-varying objective; a toy illustration of this view follows this entry.
We show that our framework results in filters that are effective, robust, and scalable to high-dimensional systems.
arXiv Detail & Related papers (2023-11-17T15:30:44Z)
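A toy version of the optimization view mentioned above (our own simplification, not the paper's algorithm): each filtering step minimizes a time-varying MAP objective, here a Gaussian state-space negative log posterior handled with plain gradient descent.

```python
# Hedged sketch of the general idea (not the paper's method): treat each
# filtering step as MAP optimization of a time-varying objective
#   x_t -> -log p(y_t | x_t) - log p(x_t | x_{t-1}),
# solved with a few gradient steps. A Gaussian model is assumed for illustration.
import numpy as np

def map_filter_step(x_prev, y_t, A, C, q_var, r_var, lr=0.1, iters=200):
    """One MAP step for x_t ~ N(A x_prev, q_var I), y_t ~ N(C x_t, r_var I)."""
    x = A @ x_prev                         # initialize at the predicted state
    for _ in range(iters):
        grad = (x - A @ x_prev) / q_var + C.T @ (C @ x - y_t) / r_var
        x -= lr * grad                     # gradient descent on the negative log posterior
    return x

# Toy 2-D example
A, C = np.array([[1.0, 0.1], [0.0, 1.0]]), np.eye(2)
x = np.zeros(2)
for y in np.random.randn(5, 2):
    x = map_filter_step(x, y, A, C, q_var=0.1, r_var=0.5)
print(x)
```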
- Blessing of Nonconvexity in Deep Linear Models: Depth Flattens the Optimization Landscape Around the True Solution [4.7464518249313805]
This work characterizes the effect of depth on the optimization landscape of regression.
We show that, despite their non-convexity, deeper models have a more desirable optimization landscape.
arXiv Detail & Related papers (2022-07-15T17:11:26Z)
- KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal [70.15267479220691]
We consider and analyze the sample complexity of KL-entropy-regularized reinforcement learning with a generative model.
Our analysis shows that the approach is nearly minimax-optimal for finding an $\varepsilon$-optimal policy when $\varepsilon$ is sufficiently small.
arXiv Detail & Related papers (2022-05-27T19:39:24Z)
- Minimax Optimal Quantization of Linear Models: Information-Theoretic Limits and Efficient Algorithms [59.724977092582535]
We consider the problem of quantizing a linear model learned from measurements.
We derive an information-theoretic lower bound for the minimax risk under this setting.
We show that our method and upper bounds can be extended to two-layer ReLU neural networks.
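To make the setting above concrete, here is a toy, hedged sketch (uniform scalar quantization of a least-squares fit; the paper's actual scheme and bounds differ): learn a linear model from noisy measurements, quantize its weights to $b$ bits, and compare prediction risks.

```python
# Hedged toy sketch (not the paper's scheme): learn a linear model from noisy
# measurements, uniformly quantize its weights to b bits, and compare the
# prediction risk of the quantized model to the unquantized one.
import numpy as np

rng = np.random.default_rng(0)
n, d, b = 200, 10, 4                      # samples, dimension, bits per weight
X = rng.standard_normal((n, d))
w_star = rng.standard_normal(d)
y = X @ w_star + 0.1 * rng.standard_normal(n)

w_hat, *_ = np.linalg.lstsq(X, y, rcond=None)     # learned linear model

# Uniform scalar quantizer over the range of the learned weights
lo, hi = w_hat.min(), w_hat.max()
levels = 2 ** b
step = (hi - lo) / (levels - 1)
w_q = lo + step * np.round((w_hat - lo) / step)

X_test = rng.standard_normal((1000, d))
risk = lambda w: np.mean((X_test @ (w - w_star)) ** 2)
print(f"risk(w_hat) = {risk(w_hat):.4f}, risk(quantized) = {risk(w_q):.4f}")
```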
arXiv Detail & Related papers (2022-02-23T02:39:04Z)
- Dependency Aware Filter Pruning [74.69495455411987]
Pruning a proportion of unimportant filters is an efficient way to mitigate the inference cost.
Previous work prunes filters according to their weight norms or the corresponding batch-norm scaling factors; a sketch of this norm-based criterion follows this entry.
We propose a novel mechanism to dynamically control the sparsity-inducing regularization so as to achieve the desired sparsity.
arXiv Detail & Related papers (2020-05-06T07:41:22Z)
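A minimal sketch of the norm-based criterion referenced above (prior work, not this paper's dependency-aware mechanism): rank filters by the magnitude of their batch-norm scaling factors and mark the smallest fraction for removal.

```python
# Hedged sketch of norm-based filter pruning (the baseline criterion mentioned
# in the summary, not the dependency-aware method proposed by the paper).
import numpy as np

def select_filters_to_prune(bn_gammas: np.ndarray, prune_ratio: float) -> np.ndarray:
    """Return indices of filters whose |gamma| falls in the lowest `prune_ratio` fraction."""
    scores = np.abs(bn_gammas)                 # importance proxy per output channel
    k = int(prune_ratio * len(scores))         # number of filters to drop
    return np.argsort(scores)[:k]              # indices of the least important filters

# Example: 16 channels with random scaling factors, prune 25% of them
gammas = np.random.randn(16)
print(select_filters_to_prune(gammas, prune_ratio=0.25))
```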
This list is automatically generated from the titles and abstracts of the papers on this site.