Iterative regularization for low complexity regularizers
- URL: http://arxiv.org/abs/2202.00420v1
- Date: Tue, 1 Feb 2022 14:09:00 GMT
- Title: Iterative regularization for low complexity regularizers
- Authors: Cesare Molinari and Mathurin Massias and Lorenzo Rosasco and Silvia
Villa
- Abstract summary: Iterative regularization exploits the implicit bias of an optimization algorithm to regularize ill-posed problems.
We propose and study the first iterative regularization procedure able to handle biases described by nonsmooth and non-strongly convex functionals.
- Score: 18.87017835436693
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Iterative regularization exploits the implicit bias of an optimization
algorithm to regularize ill-posed problems. Constructing algorithms with such
built-in regularization mechanisms is a classic challenge in inverse problems
but also in modern machine learning, where it provides both a new perspective
on algorithm analysis and significant speed-ups compared to explicit
regularization. In this work, we propose and study the first iterative
regularization procedure able to handle biases described by nonsmooth and
non-strongly convex functionals, which are prominent in low-complexity regularization. Our
approach is based on a primal-dual algorithm of which we analyze convergence
and stability properties, even in the case where the original problem is
infeasible. The general results are illustrated for the special case of
sparse recovery with the $\ell_1$ penalty. Our theoretical results are
complemented by experiments showing the computational benefits of our approach.
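To make the idea concrete, the sketch below runs a standard primal-dual (Chambolle-Pock) iteration on the noisy sparse-recovery problem $\min_x \|x\|_1$ subject to $Ax = y^\delta$, and uses the number of iterations, rather than an explicit penalty weight, as the regularization parameter. This is a minimal illustration under those assumptions, not the authors' exact algorithm or code; all variable names and constants are made up for the example.

```python
# Minimal sketch of iterative regularization for sparse recovery:
# run a Chambolle-Pock primal-dual iteration on  min ||x||_1  s.t.  A x = y_delta
# (noisy data) and treat the iteration count as the regularization parameter.
# Illustrative only; not the paper's exact procedure.
import numpy as np

rng = np.random.default_rng(0)

# Ground-truth sparse signal and noisy measurements (synthetic data).
n, d, k = 50, 200, 5
A = rng.standard_normal((n, d)) / np.sqrt(n)
x_true = np.zeros(d)
x_true[rng.choice(d, k, replace=False)] = rng.standard_normal(k)
y_delta = A @ x_true + 0.01 * rng.standard_normal(n)   # noisy right-hand side

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (component-wise soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

# Primal-dual iteration for  min ||x||_1  s.t.  A x = y_delta.
L = np.linalg.norm(A, 2)        # operator norm of A
tau = sigma = 0.99 / L          # step sizes with tau * sigma * ||A||^2 < 1
x = np.zeros(d)
x_bar = np.zeros(d)
p = np.zeros(n)

errors = []
for it in range(1, 501):
    p = p + sigma * (A @ x_bar - y_delta)              # dual ascent on the constraint
    x_new = soft_threshold(x - tau * (A.T @ p), tau)   # primal prox step (l1)
    x_bar = 2 * x_new - x                              # extrapolation
    x = x_new
    errors.append(np.linalg.norm(x - x_true))

# Early stopping: the reconstruction error typically decreases and then
# deteriorates as the iterates start fitting the noise; stopping near the
# minimum plays the role of the regularization parameter.
best_it = int(np.argmin(errors)) + 1
print(f"best iteration: {best_it}, error there: {errors[best_it - 1]:.3f}, "
      f"error at the last iteration: {errors[-1]:.3f}")
```

Tracking `errors` over the iterations typically exhibits the semi-convergence behaviour that motivates early stopping: the distance to the true signal first decreases and then grows once the iterates begin to fit the noise.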
Related papers
- Error Feedback under $(L_0,L_1)$-Smoothness: Normalization and Momentum [56.37522020675243]
We provide the first proof of convergence for normalized error feedback algorithms across a wide range of machine learning problems.
We show that due to their larger allowable stepsizes, our new normalized error feedback algorithms outperform their non-normalized counterparts on various tasks.
arXiv Detail & Related papers (2024-10-22T10:19:27Z) - Weakly Convex Regularisers for Inverse Problems: Convergence of Critical Points and Primal-Dual Optimisation [12.455342327482223]
We present a generalised formulation of convergent regularisation in terms of critical points.
We show that this is achieved by a class of weakly convex regularisers.
Applying this theory to learned regularisation, we prove universal approximation for input weakly convex neural networks.
arXiv Detail & Related papers (2024-02-01T22:54:45Z) - Exploring the Algorithm-Dependent Generalization of AUPRC Optimization
with List Stability [107.65337427333064]
Optimization of the Area Under the Precision-Recall Curve (AUPRC) is a crucial problem in machine learning.
In this work, we present the first trial in the algorithm-dependent generalization of AUPRC optimization.
Experiments on three image retrieval datasets speak to the effectiveness and soundness of our framework.
arXiv Detail & Related papers (2022-09-27T09:06:37Z) - First-Order Algorithms for Nonlinear Generalized Nash Equilibrium
Problems [88.58409977434269]
We consider the problem of computing an equilibrium in a class of nonlinear generalized Nash equilibrium problems (NGNEPs).
Our contribution is to provide two simple first-order algorithmic frameworks based on the quadratic penalty method and the augmented Lagrangian method.
We provide nonasymptotic theoretical guarantees for these algorithms.
arXiv Detail & Related papers (2022-04-07T00:11:05Z) - SCORE: Approximating Curvature Information under Self-Concordant
Regularization [0.0]
We propose a generalized Gauss-Newton with Self-Concordant Regularization (GGN-SCORE) algorithm that updates the minimization speed each time it receives a new input.
The proposed algorithm exploits the structure of the second-order information in the Hessian matrix, thereby reducing computational overhead.
arXiv Detail & Related papers (2021-12-14T13:03:04Z) - The Benefits of Implicit Regularization from SGD in Least Squares
Problems [116.85246178212616]
Stochastic gradient descent (SGD) exhibits strong algorithmic regularization effects in practice.
We make comparisons of the implicit regularization afforded by (unregularized) average SGD with the explicit regularization of ridge regression; a toy numerical comparison is sketched after this list.
arXiv Detail & Related papers (2021-08-10T09:56:47Z) - Fractal Structure and Generalization Properties of Stochastic
Optimization Algorithms [71.62575565990502]
We prove that the generalization error of an optimization algorithm can be bounded in terms of the 'complexity' of the fractal structure that underlies its invariant measure.
We further specialize our results to specific problems (e.g., linear/logistic regression, one-hidden-layer neural networks) and algorithms.
arXiv Detail & Related papers (2021-06-09T08:05:36Z) - Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
Inductive biases are central in preventing overfitting empirically.
This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression.
We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD in comparison to ordinary least squares.
arXiv Detail & Related papers (2021-03-23T17:15:53Z) - Iterative regularization for convex regularizers [18.87017835436693]
We study iterative regularization for linear models, when the bias is convex but not necessarily strongly convex.
We characterize the stability properties of a primal-dual gradient based approach, analyzing its convergence in the presence of worst case deterministic noise.
arXiv Detail & Related papers (2020-06-17T13:39:29Z) - Can Implicit Bias Explain Generalization? Stochastic Convex Optimization
as a Case Study [43.586269524826854]
Implicit regularization refers to the tendency of the optimization algorithm towards a certain structured solution that often generalizes well.
We provide a simple construction that rules out the existence of a distribution-independent implicit regularizer that governs the generalization ability of Stochastic Gradient Descent (SGD).
We then demonstrate a learning problem that rules out a very general class of distribution-dependent implicit regularizers from explaining generalization, which includes strongly convex regularizers as well as non-degenerate norm-based regularizations.
arXiv Detail & Related papers (2020-03-13T08:40:33Z)
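As referenced in the entry above on the benefits of implicit regularization from SGD in least squares problems, the comparison between constant-stepsize averaged SGD and explicit ridge regression can be illustrated with a toy sketch. This is an illustrative example only, not the cited paper's analysis; the data, step size, and penalty grid are arbitrary.

```python
# Toy comparison (illustrative only): implicit regularization of constant-stepsize
# averaged SGD on least squares versus explicitly regularized ridge regression.
import numpy as np

rng = np.random.default_rng(1)
n, d = 500, 20
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true + 0.5 * rng.standard_normal(n)

# Constant-stepsize SGD with iterate averaging on the least squares objective.
step = 0.01
w = np.zeros(d)
w_avg = np.zeros(d)
for t in range(1, 5 * n + 1):
    i = rng.integers(n)
    grad = (X[i] @ w - y[i]) * X[i]   # stochastic gradient of 0.5 * (x_i^T w - y_i)^2
    w = w - step * grad
    w_avg += (w - w_avg) / t          # running average of the iterates

# Explicit ridge regression for a grid of penalties, for comparison.
for lam in [0.0, 1.0, 10.0]:
    w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
    print(f"lambda={lam:5.1f}  ||w_ridge - w_avg_sgd|| = "
          f"{np.linalg.norm(w_ridge - w_avg):.3f}")
```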
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.