Learning the optimal regularizer for inverse problems
- URL: http://arxiv.org/abs/2106.06513v1
- Date: Fri, 11 Jun 2021 17:14:27 GMT
- Title: Learning the optimal regularizer for inverse problems
- Authors: Giovanni S. Alberti, Ernesto De Vito, Matti Lassas, Luca Ratti, Matteo
Santacesaria
- Abstract summary: We consider the linear inverse problem $y=Ax+\epsilon$, where $A\colon X\to Y$ is a known linear operator between the separable Hilbert spaces $X$ and $Y$.
This setting covers several inverse problems in imaging including denoising, deblurring, and X-ray tomography.
Within the classical framework of regularization, we focus on the case where the regularization functional is not given a priori but learned from data.
- Score: 1.763934678295407
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we consider the linear inverse problem $y=Ax+\epsilon$, where
$A\colon X\to Y$ is a known linear operator between the separable Hilbert
spaces $X$ and $Y$, $x$ is a random variable in $X$ and $\epsilon$ is a
zero-mean random process in $Y$. This setting covers several inverse problems
in imaging including denoising, deblurring, and X-ray tomography. Within the
classical framework of regularization, we focus on the case where the
regularization functional is not given a priori but learned from data. Our
first result is a characterization of the optimal generalized Tikhonov
regularizer, with respect to the mean squared error. We find that it is
completely independent of the forward operator $A$ and depends only on the mean
and covariance of $x$. Then, we consider the problem of learning the
regularizer from a finite training set in two different frameworks: one
supervised, based on samples of both $x$ and $y$, and one unsupervised, based
only on samples of $x$. In both cases, we prove generalization bounds, under
some weak assumptions on the distribution of $x$ and $\epsilon$, including the
case of sub-Gaussian variables. Our bounds hold in infinite-dimensional spaces,
thereby showing that finer and finer discretizations do not make this learning
problem harder. The results are validated through numerical simulations.
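The characterization above (an optimal generalized Tikhonov regularizer that is independent of $A$ and determined only by the mean and covariance of $x$) can be illustrated numerically. The sketch below is not the paper's code: it assumes a finite-dimensional toy model with white Gaussian noise of known level $\sigma$, uses the quadratic penalty $\sigma^2\|\Sigma_x^{-1/2}(x-\mu_x)\|^2$ with $\mu_x$ and $\Sigma_x$ estimated from unsupervised training samples of $x$ (the particular weighting is an assumption of the sketch, chosen so that for Gaussian $x$ the reconstruction coincides with the posterior mean), and plugs it into a generalized Tikhonov reconstruction for a blur operator $A$ that the learned penalty never sees.
```python
# Minimal sketch (not the paper's code): learn a quadratic regularizer from
# unsupervised samples of x and use it in a generalized Tikhonov reconstruction.
import numpy as np

rng = np.random.default_rng(0)
d, n_train, n_test, sigma = 50, 2000, 200, 0.1

# Ground-truth signal model: Gaussian x with nonzero mean and decaying spectrum.
mu_true = np.ones(d)
U = np.linalg.qr(rng.standard_normal((d, d)))[0]
Sigma_true = U @ np.diag(1.0 / np.arange(1, d + 1) ** 2) @ U.T

# Forward operator: a simple Gaussian blur.  The learned penalty below never
# looks at A, as predicted by the characterization of the optimal regularizer.
t = np.arange(d)
A = np.exp(-0.5 * ((t[:, None] - t[None, :]) / 2.0) ** 2)
A /= A.sum(axis=1, keepdims=True)

# "Unsupervised" training data: samples of x only.
X_train = rng.multivariate_normal(mu_true, Sigma_true, size=n_train)
mu_hat = X_train.mean(axis=0)
Sigma_hat = np.cov(X_train, rowvar=False) + 1e-6 * np.eye(d)  # ridge for invertibility

# Test data: y = A x + eps with white Gaussian noise of level sigma.
X_test = rng.multivariate_normal(mu_true, Sigma_true, size=n_test)
Y_test = X_test @ A.T + sigma * rng.standard_normal((n_test, d))

# Generalized Tikhonov with the learned penalty:
#   x_hat = argmin_x ||A x - y||^2 + sigma^2 (x - mu_hat)^T Sigma_hat^{-1} (x - mu_hat),
# i.e. (A^T A + sigma^2 Sigma_hat^{-1}) x_hat = A^T y + sigma^2 Sigma_hat^{-1} mu_hat.
P = np.linalg.inv(Sigma_hat)
lhs_learned = A.T @ A + sigma**2 * P
rhs_learned = Y_test @ A + sigma**2 * (P @ mu_hat)  # one row per test sample
X_learned = np.linalg.solve(lhs_learned, rhs_learned.T).T

# Baseline: classical Tikhonov penalty sigma^2 ||x||^2 (no learning).
X_plain = np.linalg.solve(A.T @ A + sigma**2 * np.eye(d), (Y_test @ A).T).T

def mse(X_hat):
    return np.mean(np.sum((X_hat - X_test) ** 2, axis=1))

print(f"MSE with learned quadratic regularizer: {mse(X_learned):.4f}")
print(f"MSE with plain Tikhonov regularizer:    {mse(X_plain):.4f}")
```
In this toy setting the learned quadratic penalty typically outperforms the unweighted Tikhonov baseline, in line with the abstract's claim that only the statistics of $x$ matter for the optimal regularizer.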
Related papers
- A Unified Framework for Uniform Signal Recovery in Nonlinear Generative
Compressed Sensing [68.80803866919123]
Under nonlinear measurements, most prior results are non-uniform, i.e., they hold with high probability for a fixed $\mathbf{x}^*$ rather than for all $\mathbf{x}^*$ simultaneously.
Our framework accommodates GCS with 1-bit/uniformly quantized observations and single index models as canonical examples.
We also develop a concentration inequality that produces tighter bounds for product processes whose index sets have low metric entropy.
arXiv Detail & Related papers (2023-09-25T17:54:19Z)
- Beyond Moments: Robustly Learning Affine Transformations with Asymptotically Optimal Error [8.615625517708324]
We present a polynomial-time algorithm for learning an unknown affine transformation of the standard hypercube from samples.
Our algorithm is based on a new method that iteratively improves an estimate of the unknown affine transformation whenever the requirements of the certificate are not met.
arXiv Detail & Related papers (2023-02-23T19:13:30Z)
- Horizon-Free and Variance-Dependent Reinforcement Learning for Latent Markov Decision Processes [62.90204655228324]
We study regret minimization for reinforcement learning (RL) in Latent Markov Decision Processes (LMDPs) with context in hindsight.
We design a novel model-based algorithmic framework which can be instantiated with both a model-optimistic and a value-optimistic solver.
arXiv Detail & Related papers (2022-10-20T21:32:01Z)
- Variance-Aware Sparse Linear Bandits [64.70681598741417]
Worst-case minimax regret for sparse linear bandits is $\widetilde{\Theta}\left(\sqrt{dT}\right)$.
In the benign setting where there is no noise and the action set is the unit sphere, one can use divide-and-conquer to achieve an $\widetilde{\mathcal{O}}(1)$ regret.
We develop a general framework that converts any variance-aware linear bandit algorithm to a variance-aware algorithm for sparse linear bandits.
arXiv Detail & Related papers (2022-05-26T15:55:44Z)
- Optimal Spectral Recovery of a Planted Vector in a Subspace [80.02218763267992]
We study efficient estimation and detection of a planted vector $v$ whose $\ell_4$ norm differs from that of a Gaussian vector with the same $\ell_2$ norm.
We show that in the regime $n\rho \gg \sqrt{N}$, any spectral method from a large class (and more generally, any low-degree polynomial of the input) fails to detect the planted vector.
arXiv Detail & Related papers (2021-05-31T16:10:49Z)
- Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^*\rangle + \epsilon$ (a toy sketch of this model appears after this list).
We propose estimators for this problem under two settings: (i) $X$ is L4-L2 hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number and $\epsilon$ has bounded variance, and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is sub-Gaussian.
arXiv Detail & Related papers (2020-07-16T06:44:44Z)
- Faster Uncertainty Quantification for Inverse Problems with Conditional Normalizing Flows [0.9176056742068814]
In inverse problems, we often have data consisting of paired samples $(x,y)\sim p_{X,Y}(x,y)$ where $y$ are partial observations of a physical system.
We propose a two-step scheme, which makes use of normalizing flows and joint data to train a conditional generator $q_\theta(x|y)$.
arXiv Detail & Related papers (2020-07-15T20:36:30Z)
- How isotropic kernels perform on simple invariants [0.5729426778193397]
We investigate how the training curve of isotropic kernel methods depends on the symmetry of the task to be learned.
We show that for large bandwidth, $\beta = \frac{d-1+\xi}{3d-3+\xi}$, where $\xi\in(0,2)$ is the exponent characterizing the singularity of the kernel at the origin (a short numerical check of this formula appears after this list).
arXiv Detail & Related papers (2020-06-17T09:59:18Z)
- Locally Private Hypothesis Selection [96.06118559817057]
We output a distribution from $\mathcal{Q}$ whose total variation distance to $p$ is comparable to the best such distribution.
We show that the constraint of local differential privacy incurs an exponential increase in cost.
Our algorithms result in exponential improvements on the round complexity of previous methods.
arXiv Detail & Related papers (2020-02-21T18:30:48Z)
- Does generalization performance of $l^q$ regularization learning depend on $q$? A negative example [19.945160684285003]
$l^q$-regularization has been demonstrated to be an attractive technique in machine learning and statistical modeling.
We show that all $l^q$ estimators for $0 < q < \infty$ attain similar generalization error bounds.
This finding tentatively reveals that, in some modeling contexts, the choice of $q$ might not have a strong impact in terms of the generalization capability.
arXiv Detail & Related papers (2013-07-25T00:48:04Z)
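For the robust linear regression entry above, the following toy sketch simply simulates the stated generative model $Y = \langle X, w^*\rangle + \epsilon$ and fits it with scikit-learn's HuberRegressor as a generic robust baseline; the 5% of grossly corrupted responses and the choice of estimator are assumptions of this sketch, not the corruption model or the nearly-linear-time algorithm of that paper.
```python
# Toy illustration of the generative model Y = <X, w*> + eps.  A small fraction
# of responses is grossly corrupted purely for illustration (an assumption of
# this sketch), and the fit uses scikit-learn's HuberRegressor as a generic
# robust baseline -- NOT the estimator analyzed in the paper above.
import numpy as np
from sklearn.linear_model import HuberRegressor, LinearRegression

rng = np.random.default_rng(1)
n, d = 2000, 20
w_star = rng.standard_normal(d)

X = rng.standard_normal((n, d))          # sub-Gaussian (Gaussian) design
y = X @ w_star + rng.standard_normal(n)  # clean linear responses with noise
outliers = rng.choice(n, size=n // 20, replace=False)
y[outliers] += 50.0                      # gross corruptions on 5% of the responses

huber = HuberRegressor(fit_intercept=False).fit(X, y)
ols = LinearRegression(fit_intercept=False).fit(X, y)
print("||w_huber - w*|| =", np.linalg.norm(huber.coef_ - w_star))
print("||w_ols   - w*|| =", np.linalg.norm(ols.coef_ - w_star))
```
Under this kind of contamination the Huber fit is typically much closer to $w^*$ than ordinary least squares, which is the qualitative point robust regression addresses.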
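And as a quick numerical reading of the large-bandwidth prediction quoted for the isotropic-kernel entry above, this snippet just evaluates $\beta = \frac{d-1+\xi}{3d-3+\xi}$ over a few dimensions $d$ and kernel exponents $\xi$; note that $\beta \to 1/3$ as $d$ grows, for any fixed $\xi\in(0,2)$.
```python
# Evaluate the predicted large-bandwidth learning-curve exponent
#   beta = (d - 1 + xi) / (3d - 3 + xi),  xi in (0, 2),
# for a few dimensions; beta approaches 1/3 as d grows.
def beta(d: int, xi: float) -> float:
    return (d - 1 + xi) / (3 * d - 3 + xi)

for d in (2, 5, 10, 100):
    print(d, [round(beta(d, xi), 3) for xi in (0.5, 1.0, 1.5)])
```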
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.