Neurally Augmented ALISTA
- URL: http://arxiv.org/abs/2010.01930v1
- Date: Mon, 5 Oct 2020 11:39:49 GMT
- Title: Neurally Augmented ALISTA
- Authors: Freya Behrens, Jonathan Sauder and Peter Jung
- Abstract summary: We introduce Neurally Augmented ALISTA, in which an LSTM network is used to compute step sizes and thresholds individually for each target vector during reconstruction.
We show that our approach further improves empirical performance in sparse reconstruction, in particular outperforming existing algorithms by an increasing margin as the compression ratio becomes more challenging.
- Score: 15.021419552695066
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: It is well-established that many iterative sparse reconstruction algorithms
can be unrolled to yield a learnable neural network for improved empirical
performance. A prime example is learned ISTA (LISTA) where weights, step sizes
and thresholds are learned from training data. Recently, Analytic LISTA
(ALISTA) has been introduced, combining the strong empirical performance of a
fully learned approach like LISTA with the theoretical guarantees of
classical compressed sensing algorithms, while significantly reducing the
number of parameters to learn. However, these parameters are trained to work in
expectation, often leading to suboptimal reconstruction of individual targets.
In this work we therefore introduce Neurally Augmented ALISTA, in which an LSTM
network is used to compute step sizes and thresholds individually for each
target vector during reconstruction. This adaptive approach is theoretically
motivated by revisiting the recovery guarantees of ALISTA. We show that our
approach further improves empirical performance in sparse reconstruction, in
particular outperforming existing algorithms by an increasing margin as the
compression ratio becomes more challenging.
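The iteration the abstract describes can be sketched in a few lines: ALISTA-style reconstruction alternates a gradient step on the residual with soft-thresholding, and the point of Neurally Augmented ALISTA is that the per-iteration step sizes and thresholds are produced adaptively per target. The sketch below is a minimal illustration, not the paper's implementation: the LSTM is replaced by a fixed schedule of `gammas` and `thetas`, and `A.T` stands in for ALISTA's analytically computed weight matrix.

```python
import numpy as np

def soft_threshold(x, theta):
    # Proximal operator of the l1 norm: shrink each entry toward zero by theta.
    return np.sign(x) * np.maximum(np.abs(x) - theta, 0.0)

def alista_like_reconstruction(y, A, W, gammas, thetas):
    """Sparse reconstruction with ALISTA-style iterations.

    y: measurements (m,); A: sensing matrix (m, n); W: weight matrix (n, m),
    analytically precomputed from A in ALISTA (a stand-in is used below).
    gammas, thetas: per-iteration step sizes and thresholds. In Neurally
    Augmented ALISTA these would be produced by an LSTM for each target
    during reconstruction; here a fixed schedule is used for illustration.
    """
    x = np.zeros(A.shape[1])
    for gamma, theta in zip(gammas, thetas):
        residual = y - A @ x
        x = soft_threshold(x + gamma * (W @ residual), theta)
    return x

# Toy example: recover a 2-sparse vector from noiseless measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 100)) / np.sqrt(50)
x_true = np.zeros(100)
x_true[[3, 42]] = [1.5, -2.0]
y = A @ x_true
W = A.T  # stand-in for the analytically computed ALISTA weight matrix
x_hat = alista_like_reconstruction(y, A, W,
                                   gammas=[0.2] * 100, thetas=[0.05] * 100)
```

With fixed `gamma` and `theta` this is just ISTA converging to a lasso solution with regularization `theta / gamma`; the paper's adaptive, per-target schedules are what distinguish NA-ALISTA from this baseline.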
Related papers
- Precise asymptotics of reweighted least-squares algorithms for linear diagonal networks [15.074950361970194]
We provide a unified analysis for a family of algorithms that encompasses IRLS, the recently proposed lin-RFM algorithm, and alternating minimization for linear diagonal networks.
We show that, with an appropriately chosen reweighting policy, these algorithms can achieve favorable performance across a range of sparse structures.
We also show that leveraging this in the reweighting scheme provably improves test error compared to coordinate-wise reweighting.
arXiv Detail & Related papers (2024-06-04T20:37:17Z) - REBEL: Reinforcement Learning via Regressing Relative Rewards [59.68420022466047]
We propose REBEL, a minimalist RL algorithm for the era of generative models.
In theory, we prove that fundamental RL algorithms like Natural Policy Gradient can be seen as variants of REBEL.
We find that REBEL provides a unified approach to language modeling and image generation with stronger or similar performance as PPO and DPO.
arXiv Detail & Related papers (2024-04-25T17:20:45Z) - Loop Unrolled Shallow Equilibrium Regularizer (LUSER) -- A
Memory-Efficient Inverse Problem Solver [26.87738024952936]
In inverse problems we aim to reconstruct some underlying signal of interest from potentially corrupted and often ill-posed measurements.
We propose a loop unrolled (LU) algorithm with shallow equilibrium regularizers (LUSER).
These implicit models are as expressive as deeper convolutional networks, but far more memory efficient during training.
arXiv Detail & Related papers (2022-10-10T19:50:37Z) - Improved Algorithms for Neural Active Learning [74.89097665112621]
We improve the theoretical and empirical performance of neural-network(NN)-based active learning algorithms for the non-parametric streaming setting.
We introduce two regret metrics, defined via minimizing the population loss, that are more suitable for active learning than the one used in state-of-the-art (SOTA) related work.
arXiv Detail & Related papers (2022-10-02T05:03:38Z) - An intelligent algorithmic trading based on a risk-return reinforcement
learning algorithm [0.0]
This paper proposes a novel portfolio optimization model based on an improved deep reinforcement learning algorithm.
The proposed algorithm uses an actor-critic architecture, in which the main task of the critic network is to learn the distribution of the portfolio's cumulative return.
A multi-process method, Ape-X, is used to accelerate deep reinforcement learning training.
arXiv Detail & Related papers (2022-08-23T03:20:06Z) - Hybrid ISTA: Unfolding ISTA With Convergence Guarantees Using Free-Form
Deep Neural Networks [50.193061099112626]
It is promising to solve linear inverse problems by unfolding iterative algorithms as deep neural networks (DNNs) with learnable parameters.
Existing ISTA-based unfolded algorithms restrict the network architectures for iterative updates with the partial weight coupling structure to guarantee convergence.
This paper is the first to provide a convergence-provable framework that enables free-form DNNs in ISTA-based unfolded algorithms.
arXiv Detail & Related papers (2022-04-25T13:17:57Z) - FOSTER: Feature Boosting and Compression for Class-Incremental Learning [52.603520403933985]
Deep neural networks suffer from catastrophic forgetting when learning new categories.
We propose a novel two-stage learning paradigm FOSTER, empowering the model to learn new categories adaptively.
arXiv Detail & Related papers (2022-04-10T11:38:33Z) - Towards Understanding Label Smoothing [36.54164997035046]
Label smoothing regularization (LSR) has achieved great success in training deep neural networks.
We show that an appropriate LSR can help to speed up convergence by reducing the variance.
We propose a simple yet effective strategy, namely Two-Stage LAbel smoothing algorithm (TSLA)
arXiv Detail & Related papers (2020-06-20T20:36:17Z) - Reparameterized Variational Divergence Minimization for Stable Imitation [57.06909373038396]
We study the extent to which variations in the choice of probabilistic divergence may yield more performant imitation learning from observation (ILO) algorithms.
We contribute a reparameterization trick for adversarial imitation learning to alleviate the challenges of the promising $f$-divergence minimization framework.
Empirically, we demonstrate that our design choices allow for ILO algorithms that outperform baseline approaches and more closely match expert performance in low-dimensional continuous-control tasks.
arXiv Detail & Related papers (2020-06-18T19:04:09Z) - Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.