Small noise analysis for Tikhonov and RKHS regularizations
- URL: http://arxiv.org/abs/2305.11055v2
- Date: Wed, 4 Sep 2024 02:24:25 GMT
- Title: Small noise analysis for Tikhonov and RKHS regularizations
- Authors: Quanjun Lang, Fei Lu
- Abstract summary: We establish a small noise analysis framework to assess the effects of norms in Tikhonov and RKHS regularizations.
This framework studies the convergence rates of regularized estimators in the small noise limit and reveals the potential instability of the conventional L2-regularizer.
A surprising insight is that over-smoothing via a proposed class of adaptive fractional RKHS regularizers consistently yields optimal convergence rates.
- Score: 0.8133739801185272
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Regularization plays a pivotal role in ill-posed machine learning and inverse problems. However, the fundamental comparative analysis of various regularization norms remains open. We establish a small noise analysis framework to assess the effects of norms in Tikhonov and RKHS regularizations, in the context of ill-posed linear inverse problems with Gaussian noise. This framework studies the convergence rates of regularized estimators in the small noise limit and reveals the potential instability of the conventional L2-regularizer. We resolve this instability by proposing an innovative class of adaptive fractional RKHS regularizers, which covers the L2 Tikhonov and RKHS regularizations by adjusting the fractional smoothness parameter. A surprising insight is that over-smoothing via these fractional RKHSs consistently yields optimal convergence rates, but the optimal hyper-parameter may decay too fast to be selected in practice.
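As a rough illustration of this setting (not part of the paper), the sketch below fits a discretized linear inverse problem $y = Ax + \varepsilon$ with a Tikhonov-type estimator whose penalty interpolates between the plain L2 norm ($s = 0$) and an RKHS-type norm ($s = 1$) through a fractional smoothness parameter $s$. The concrete penalty $x^\top (A^\top A)^{-s} x$, the variable names, and the hyper-parameter values are illustrative assumptions, not the paper's construction.

```python
import numpy as np

def fractional_rkhs_tikhonov(A, y, lam, s, eps=1e-12):
    """Tikhonov-type estimator with an illustrative fractional-RKHS penalty.

    Minimizes  ||A x - y||^2 + lam * x^T B_s x  with  B_s ~= (A^T A)^{-s},
    computed via the eigendecomposition of A^T A and restricted to its
    numerical range.  s = 0 gives the usual L2 (identity) penalty; s = 1
    penalizes with the norm induced by the normal operator.
    """
    G = A.T @ A
    w, V = np.linalg.eigh(G)            # G = V diag(w) V^T, w >= 0 up to round-off
    w_s = np.zeros_like(w)
    mask = w > eps
    w_s[mask] = w[mask] ** (-s)         # pseudo-inverse fractional power of G
    B_s = (V * w_s) @ V.T               # B_s ~= (A^T A)^{-s}
    return np.linalg.solve(G + lam * B_s, A.T @ y)

# Toy ill-posed problem: quickly decaying singular values, small Gaussian noise.
rng = np.random.default_rng(0)
n = 50
U, _ = np.linalg.qr(rng.standard_normal((n, n)))
V, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = U @ np.diag(1.0 / np.arange(1, n + 1) ** 2) @ V.T
x_true = V[:, 0] + 0.5 * V[:, 1]        # signal in the well-conditioned directions
sigma = 1e-4                            # small-noise regime
y = A @ x_true + sigma * rng.standard_normal(n)

for s in (0.0, 0.5, 1.0):               # L2 penalty, fractional, RKHS-type penalty
    x_hat = fractional_rkhs_tikhonov(A, y, lam=1e-8, s=s)
    rel = np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true)
    print(f"s = {s:.1f}  relative error = {rel:.2e}")
```

In this toy example, increasing $s$ (over-smoothing) suppresses the noise-dominated small-singular-value directions, which is loosely consistent with the abstract's observation that over-smoothing can still yield good accuracy.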
Related papers
- Breaking the Heavy-Tailed Noise Barrier in Stochastic Optimization Problems [56.86067111855056]
We consider clipped methods for stochastic optimization problems with heavy-tailed noise that has structured density.
We show that it is possible to obtain convergence rates faster than $\mathcal{O}(K^{-(\alpha - 1)/\alpha})$ when the gradients have finite moments of order $\alpha$.
We prove that the resulting estimates have negligible bias and controllable variance.
arXiv Detail & Related papers (2023-11-07T17:39:17Z) - Distributionally Robust Optimization with Bias and Variance Reduction [9.341215359733601]
We show that Prospect, a gradient-based algorithm, enjoys linear convergence for smooth regularized losses.
We also show that Prospect can converge 2-3$\times$ faster than gradient-based baselines.
arXiv Detail & Related papers (2023-10-21T00:03:54Z) - Tradeoffs between convergence rate and noise amplification for momentum-based accelerated optimization algorithms [8.669461942767098]
We study momentum-based first-order optimization algorithms in which the iterations are subject to an additive white noise.
For strongly convex quadratic problems, we use the steady-state variance of the error in the optimization variable to quantify noise amplification.
We introduce two parameterized families of algorithms that strike a balance between noise amplification and settling time.
arXiv Detail & Related papers (2022-09-24T04:26:30Z) - Clipped Stochastic Methods for Variational Inequalities with Heavy-Tailed Noise [64.85879194013407]
We prove the first high-probability results with logarithmic dependence on the confidence level for stochastic methods solving monotone and structured non-monotone variational inequality problems (VIPs).
Our results match the best-known ones in the light-tails case and are novel for structured non-monotone problems.
In addition, we numerically validate that the gradient noise of many practical formulations is heavy-tailed and show that clipping improves the performance of SEG/SGDA.
arXiv Detail & Related papers (2022-06-02T15:21:55Z) - Hyperspectral Image Denoising Using Non-convex Local Low-rank and Sparse Separation with Spatial-Spectral Total Variation Regularization [49.55649406434796]
We propose a novel non-convex approach to robust principal component analysis for HSI denoising.
We develop accurate approximations to both rank and sparse components.
Experiments on both simulated and real HSIs demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-01-08T11:48:46Z) - Minimax Supervised Clustering in the Anisotropic Gaussian Mixture Model: A new take on Robust Interpolation [5.98367009147573]
We study the supervised clustering problem under the two-component anisotropic Gaussian mixture model.
We show that in the high-dimensional regime, the linear discriminant analysis (LDA) classifier turns out to be sub-optimal in the minimax sense.
arXiv Detail & Related papers (2021-11-13T05:19:37Z) - Optimizing Information-theoretical Generalization Bounds via Anisotropic Noise in SGLD [73.55632827932101]
We optimize the information-theoretical generalization bound by manipulating the noise structure in SGLD.
We prove that, under a constraint guaranteeing low empirical risk, the optimal noise covariance is the square root of the expected gradient covariance.
arXiv Detail & Related papers (2021-10-26T15:02:27Z) - Interpolation can hurt robust generalization even when there is no noise [76.3492338989419]
We show that avoiding interpolation through ridge regularization can significantly improve generalization even in the absence of noise.
We prove this phenomenon for the robust risk of both linear regression and classification and hence provide the first theoretical result on robust overfitting.
arXiv Detail & Related papers (2021-08-05T23:04:15Z) - Beyond Tikhonov: Faster Learning with Self-Concordant Losses via Iterative Regularization [120.31448970413298]
We extend the theory of Tikhonov regularization to generalized self-concordant (GSC) loss functions.
We show that fast and optimal rates can be achieved for GSC losses by using the iterated Tikhonov regularization scheme (a minimal sketch of the classical scheme appears after this list).
arXiv Detail & Related papers (2021-06-16T15:25:41Z) - Analysis of Regularized Least Squares in Reproducing Kernel Krein Spaces [35.091152823482105]
We study the properties of regularized least squares with indefinite kernels in kernel Krein spaces.
We derive learning rates in kernel Krein spaces that are the same as those in kernel Hilbert spaces.
arXiv Detail & Related papers (2020-06-01T16:55:35Z)
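The "Beyond Tikhonov" entry above relies on the iterated Tikhonov regularization scheme; as referenced there, the following is a minimal least-squares sketch of that classical scheme (not the generalized self-concordant variant analyzed in the paper; the function name and parameters are illustrative).

```python
import numpy as np

def iterated_tikhonov(A, y, lam, n_iter=10):
    """Classical iterated Tikhonov regularization for least squares.

    Step k solves  min_x ||A x - y||^2 + lam * ||x - x_prev||^2,
    whose closed form is  x = (A^T A + lam I)^{-1} (A^T y + lam * x_prev).
    More iterations reduce the bias of a single Tikhonov step; n_iter
    acts as an additional regularization parameter.
    """
    n = A.shape[1]
    G = A.T @ A + lam * np.eye(n)
    Aty = A.T @ y
    x = np.zeros(n)
    for _ in range(n_iter):
        x = np.linalg.solve(G, Aty + lam * x)
    return x
```

With `n_iter = 1` this reduces to ordinary Tikhonov (ridge) regression; in practice one would factor `G` once (e.g. via a Cholesky factorization) and reuse it across iterations.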