Bilevel learning of l1-regularizers with closed-form gradients (BLORC)
- URL: http://arxiv.org/abs/2111.10858v1
- Date: Sun, 21 Nov 2021 17:01:29 GMT
- Title: Bilevel learning of l1-regularizers with closed-form gradients (BLORC)
- Authors: Avrajit Ghosh, Michael T. McCann, Saiprasad Ravishankar
- Abstract summary: We present a method for supervised learning of sparsity-promoting regularizers.
The parameters are learned to minimize the mean squared error of reconstruction on a training set of ground truth signal and measurement pairs.
- Score: 8.138650738423722
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a method for supervised learning of sparsity-promoting
regularizers, a key ingredient in many modern signal reconstruction problems.
The parameters of the regularizer are learned to minimize the mean squared
error of reconstruction on a training set of ground truth signal and
measurement pairs. Training involves solving a challenging bilevel optimization
problem with a nonsmooth lower-level objective. We derive an expression for the
gradient of the training loss using the implicit closed-form solution of the
lower-level variational problem given by its dual problem, and provide an
accompanying gradient descent algorithm (dubbed BLORC) to minimize the loss.
Our experiments on denoising 1D signals and simple natural images show that
the proposed method learns meaningful operators, and that the analytical
gradients are faster to compute than those obtained by standard automatic
differentiation. While the approach we present is applied to denoising, we
believe it can be adapted to a wide variety of inverse problems with linear
measurement models, giving it broad applicability.
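To make the mechanism concrete: the lower-level problem is analysis-l1 denoising, x_hat(y; W) = argmin_x 0.5*||y - x||^2 + ||W x||_1, whose dual, min_{||z||_inf <= 1} 0.5*||y - W^T z||^2 with x_hat = y - W^T z*, supplies the implicit closed-form solution the abstract refers to. Below is a minimal NumPy sketch of this setup, an illustration only and not the authors' code: the sizes, signal model, and step sizes are arbitrary choices, and the finite-difference gradient step on W is a crude stand-in for the closed-form gradient that BLORC derives (and which the paper reports is faster than automatic differentiation).

```python
import numpy as np

def denoise(y, W, n_iters=300):
    # Lower-level solver: projected gradient descent on the dual
    #   min_z 0.5*||y - W^T z||^2  s.t.  ||z||_inf <= 1,
    # then recover the primal denoised signal x_hat = y - W^T z.
    z = np.zeros(W.shape[0])
    step = 1.0 / (np.linalg.norm(W, 2) ** 2 + 1e-12)  # 1/L for the dual objective
    for _ in range(n_iters):
        grad = W @ (W.T @ z - y)                 # gradient of the dual objective
        z = np.clip(z - step * grad, -1.0, 1.0)  # projection onto the l_inf ball
    return y - W.T @ z

def train_loss(W, x_true, y):
    # Upper-level objective: reconstruction MSE for one (signal, measurement) pair.
    x_hat = denoise(y, W)
    return 0.5 * np.sum((x_hat - x_true) ** 2)

rng = np.random.default_rng(0)
n, k = 16, 15                                    # signal length, operator rows (arbitrary)
x_true = np.cumsum((rng.standard_normal(n) > 1.0).astype(float))  # piecewise-constant signal
y = x_true + 0.3 * rng.standard_normal(n)        # noisy measurement
W = 0.1 * rng.standard_normal((k, n))            # learnable analysis operator

# One finite-difference gradient step on W: a slow stand-in for BLORC's
# closed-form gradient of the training loss through the dual solution.
eps, lr = 1e-5, 1e-2
base = train_loss(W, x_true, y)
G = np.zeros_like(W)
for i in range(k):
    for j in range(n):
        Wp = W.copy()
        Wp[i, j] += eps
        G[i, j] = (train_loss(Wp, x_true, y) - base) / eps
W -= lr * G
print("loss before / after one step:", base, train_loss(W, x_true, y))
```

Repeated over many training pairs, steps like this would drive W toward a sparsifying operator for the signal class (for piecewise-constant signals, something resembling a finite-difference operator); the point of the closed-form gradient is to make each step far cheaper than the finite differences used above.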
Related papers
- Gradient-Variation Online Learning under Generalized Smoothness [56.38427425920781]
Gradient-variation online learning aims to achieve regret guarantees that scale with variations in gradients of online functions.
Recent efforts in neural network optimization suggest a generalized smoothness condition, allowing smoothness to correlate with gradient norms.
We provide applications to fast-rate convergence in games and to extended adversarial optimization.
arXiv Detail & Related papers (2024-08-17T02:22:08Z)
- Aiming towards the minimizers: fast convergence of SGD for overparametrized problems [25.077446336619378]
We propose a regularity regime which endows the stochastic gradient method with the same worst-case complexity as the deterministic gradient method.
All existing guarantees require the stochastic gradient method to take small steps, thereby resulting in a much slower linear rate of convergence.
We demonstrate that our condition holds when training sufficiently wide feedforward neural networks with a linear output layer.
arXiv Detail & Related papers (2023-06-05T05:21:01Z)
- Scaling Forward Gradient With Local Losses [117.22685584919756]
Forward learning is a biologically plausible alternative to backprop for learning deep neural networks.
We show that it is possible to substantially reduce the variance of the forward gradient by applying perturbations to activations rather than weights.
Our approach matches backprop on MNIST and CIFAR-10 and significantly outperforms previously proposed backprop-free algorithms on ImageNet (a toy version of the forward-gradient estimator is sketched after this list).
arXiv Detail & Related papers (2022-10-07T03:52:27Z)
- An Accelerated Doubly Stochastic Gradient Method with Faster Explicit Model Identification [97.28167655721766]
We propose a novel accelerated doubly stochastic gradient descent (ADSGD) method for sparsity regularized loss minimization problems.
We first prove that ADSGD can achieve a linear convergence rate and lower overall computational complexity.
arXiv Detail & Related papers (2022-08-11T22:27:22Z)
- Learning Sparsity-Promoting Regularizers using Bilevel Optimization [9.18465987536469]
We present a method for supervised learning of sparsity-promoting regularizers for denoising signals and images.
Experiments with structured 1D signals and natural images show that the proposed method can learn an operator that outperforms well-known regularizers.
arXiv Detail & Related papers (2022-07-18T20:50:02Z)
- Continuous-Time Meta-Learning with Forward Mode Differentiation [65.26189016950343]
We introduce Continuous-Time Meta-Learning (COMLN), a meta-learning algorithm where adaptation follows the dynamics of a gradient vector field.
Treating the learning process as an ODE offers the notable advantage that the length of the trajectory becomes a continuous quantity rather than a fixed number of gradient steps.
We show empirically its efficiency in terms of runtime and memory usage, and we illustrate its effectiveness on a range of few-shot image classification problems.
arXiv Detail & Related papers (2022-03-02T22:35:58Z)
- Solving Linear Inverse Problems Using the Prior Implicit in a Denoiser [7.7288480250888]
We develop a robust and general methodology for making use of implicit priors in deep neural networks.
A CNN trained to perform blind (i.e., with unknown noise level) least-squares denoising is presented.
A generalization of this algorithm to constrained sampling provides a method for using the implicit prior to solve any linear inverse problem.
arXiv Detail & Related papers (2020-07-27T15:40:46Z)
- Cogradient Descent for Bilinear Optimization [124.45816011848096]
We introduce a Cogradient Descent algorithm (CoGD) to address the bilinear problem.
We solve one variable by considering its coupling relationship with the other, leading to a synchronous gradient descent.
Our algorithm is applied to solve problems with one variable under a sparsity constraint.
arXiv Detail & Related papers (2020-06-16T13:41:54Z)
- Supervised Learning of Sparsity-Promoting Regularizers for Denoising [13.203765985718205]
We present a method for supervised learning of sparsity-promoting regularizers for image denoising.
Our experiments show that the proposed method can learn an operator that outperforms well-known regularizers.
arXiv Detail & Related papers (2020-06-09T21:38:05Z)
- Neural Control Variates [71.42768823631918]
We show that a set of neural networks can address the challenge of finding a good approximation of the integrand.
We derive a theoretically optimal, variance-minimizing loss function, and propose an alternative, composite loss for stable online training in practice.
Specifically, we show that the learned light-field approximation is of sufficient quality for high-order bounces, allowing us to omit the error correction and thereby dramatically reduce the noise at the cost of negligible visible bias.
arXiv Detail & Related papers (2020-06-02T11:17:55Z)
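Related to the forward-gradient entry above (Scaling Forward Gradient With Local Losses), here is a toy sketch of the basic estimator that line of work builds on: for a random direction v, (grad f . v) v is an unbiased estimate of grad f, computable from a single forward-mode directional derivative. The function, dimension, and sample count here are arbitrary choices; the variance reduction that paper proposes, perturbing activations rather than weights and attaching local losses, is not reproduced in this sketch.

```python
import numpy as np

def f(theta):
    # Toy loss with a known gradient so the estimator can be checked.
    return 0.5 * np.sum(theta ** 2)

def grad_f(theta):
    return theta  # analytic gradient of f; only its dot product with v is needed

def forward_gradient(theta, rng):
    # Forward gradient: E[(grad . v) v] = grad for v ~ N(0, I), so one
    # directional derivative (a forward-mode JVP in practice) yields an
    # unbiased, if noisy, gradient estimate.
    v = rng.standard_normal(theta.shape)
    directional = grad_f(theta) @ v
    return directional * v

rng = np.random.default_rng(0)
theta = rng.standard_normal(10)
est = np.mean([forward_gradient(theta, rng) for _ in range(20000)], axis=0)
print("max abs error vs true gradient:", np.max(np.abs(est - grad_f(theta))))
```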
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.