Iterative Regularization with k-support Norm: An Important Complement to Sparse Recovery
- URL: http://arxiv.org/abs/2401.05394v4
- Date: Tue, 19 Mar 2024 21:04:28 GMT
- Title: Iterative Regularization with k-support Norm: An Important Complement to Sparse Recovery
- Authors: William de Vazelhes, Bhaskar Mukhoty, Xiao-Tong Yuan, Bin Gu
- Abstract summary: We propose a novel iterative regularization algorithm, IRKSN, based on the $k$-support norm regularizer.
We provide conditions for sparse recovery with IRKSN, and compare them with traditional conditions for recovery with $\ell_1$ norm regularizers.
We also give an early stopping bound on the model error of IRKSN with explicit constants, achieving the standard linear rate for sparse recovery.
- Score: 33.26163081551751
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sparse recovery is ubiquitous in machine learning and signal processing. Due to the NP-hard nature of sparse recovery, existing methods are known to suffer either from restrictive (or even unknown) applicability conditions, or high computational cost. Recently, iterative regularization methods have emerged as a promising fast approach because they can achieve sparse recovery in one pass through early stopping, rather than the tedious grid-search used in the traditional methods. However, most of those iterative methods are based on the $\ell_1$ norm which requires restrictive applicability conditions and could fail in many cases. Therefore, achieving sparse recovery with iterative regularization methods under a wider range of conditions has yet to be further explored. To address this issue, we propose a novel iterative regularization algorithm, IRKSN, based on the $k$-support norm regularizer rather than the $\ell_1$ norm. We provide conditions for sparse recovery with IRKSN, and compare them with traditional conditions for recovery with $\ell_1$ norm regularizers. Additionally, we give an early stopping bound on the model error of IRKSN with explicit constants, achieving the standard linear rate for sparse recovery. Finally, we illustrate the applicability of our algorithm on several experiments, including a support recovery experiment with a correlated design matrix.
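For readers unfamiliar with the regularizer, the $k$-support norm (Argyriou et al., 2012) interpolates between the $\ell_1$ norm ($k=1$) and the $\ell_2$ norm ($k=d$), and it can be evaluated in closed form from the sorted absolute entries of a vector. The snippet below is a minimal NumPy sketch of that evaluation; it only illustrates the regularizer itself and is not the IRKSN algorithm or the authors' code.

```python
import numpy as np

def k_support_norm(w, k):
    """Evaluate the k-support norm of w (Argyriou, Foygel, Srebro, 2012).

    ||w||_k^sp = sqrt( sum_{i=1}^{k-r-1} a_i^2 + (1/(r+1)) * (sum_{i=k-r}^{d} a_i)^2 ),
    where a is |w| sorted in decreasing order and r in {0, ..., k-1} is the
    unique integer with a_{k-r-1} > (1/(r+1)) * sum_{i=k-r}^{d} a_i >= a_{k-r}
    (convention a_0 = +inf). For k = 1 this is the l1 norm; for k = d the l2 norm.
    """
    a = np.sort(np.abs(np.asarray(w, dtype=float)))[::-1]
    d = a.size
    assert 1 <= k <= d
    for r in range(k):
        tail = a[k - r - 1:].sum() / (r + 1)                 # (1/(r+1)) * tail sum
        head_val = np.inf if k - r - 1 == 0 else a[k - r - 2]
        if head_val > tail >= a[k - r - 1]:                  # the defining condition on r
            head = a[:k - r - 1]
            return float(np.sqrt(head @ head + (r + 1) * tail ** 2))
    raise RuntimeError("no valid r found; should not happen for 1 <= k <= d")

# Sanity checks: k = 1 recovers the l1 norm, k = d recovers the l2 norm.
w = np.array([3.0, -2.0, 1.0, 0.0])
print(k_support_norm(w, 1), np.abs(w).sum())         # both 6.0
print(k_support_norm(w, w.size), np.linalg.norm(w))  # both sqrt(14)
```

In IRKSN this norm takes the place of the $\ell_1$ penalty inside an iteratively regularized scheme whose iterates are stopped early; the checks at the bottom simply confirm the two limiting cases of the norm.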
Related papers
- A Connection between One-Step Regularization and Critic Regularization in Reinforcement Learning [163.44116192806922]
One-step methods perform regularization by doing just a single step of policy improvement.
Critic regularization methods do many steps of policy improvement with a regularized objective.
Applying one iteration of a multi-step critic regularization method with a regularization coefficient of 1 yields the same policy as one-step RL.
arXiv Detail & Related papers (2023-07-24T17:46:32Z) - Linear Convergence of Reshuffling Kaczmarz Methods With Sparse Constraints [7.936519714074615]
The Kaczmarz method (KZ) and its variants have been extensively studied due to their simplicity and efficiency in solving linear systems of equations.
We provide the first theoretical convergence guarantees for KHT by showing that it converges linearly to the solution of a system with sparsity constraints.
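To make the algorithmic idea concrete, the toy sketch below interleaves reshuffled Kaczmarz row projections with a hard-thresholding step that keeps the $s$ largest entries after each pass. All names and parameter choices here are illustrative assumptions; it is not claimed to reproduce the exact reshuffling method or guarantees of that paper.

```python
import numpy as np

def hard_threshold(x, s):
    """Keep the s largest-magnitude entries of x and zero out the rest."""
    out = np.zeros_like(x)
    keep = np.argsort(np.abs(x))[-s:]
    out[keep] = x[keep]
    return out

def kaczmarz_hard_thresholding(A, y, s, n_epochs=50, seed=0):
    """Toy sparse solver: reshuffled Kaczmarz row projections, pruned to an
    s-sparse iterate once per pass over the rows."""
    rng = np.random.default_rng(seed)
    m, d = A.shape
    x = np.zeros(d)
    for _ in range(n_epochs):
        for i in rng.permutation(m):                  # reshuffled pass over rows
            a = A[i]
            x = x + (y[i] - a @ x) / (a @ a) * a      # Kaczmarz projection onto row i
        x = hard_threshold(x, s)                      # enforce s-sparsity
    return x

# Tiny usage example on a random, noiseless sparse system.
rng = np.random.default_rng(1)
A = rng.standard_normal((80, 200))
x_true = np.zeros(200)
x_true[:5] = rng.standard_normal(5)
y = A @ x_true
print(np.linalg.norm(kaczmarz_hard_thresholding(A, y, s=5) - x_true))
```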
arXiv Detail & Related papers (2023-04-20T07:14:24Z) - Restarts subject to approximate sharpness: A parameter-free and optimal scheme for first-order methods [0.6554326244334866]
Sharpness is an assumption in continuous optimization that bounds the distance from minima by objective function suboptimality.
Sharpness involves problem-specific constants that are typically unknown, and restart schemes typically reduce convergence rates.
We consider the assumption of approximate sharpness, a generalization of sharpness that incorporates an unknown constant perturbation to the objective function error.
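As a rough point of reference (constants and normalizations vary between papers), write $\hat X$ for the set of minimizers, $\hat f$ for the optimal value, and $d(x,\hat X)$ for the distance to $\hat X$; sharpness and its approximate relaxation then read
\[
\alpha \, d(x,\hat X)^{\beta} \;\le\; f(x) - \hat f \qquad \text{(sharpness)},
\]
\[
\alpha \, d(x,\hat X)^{\beta} \;\le\; f(x) - \hat f + \eta \qquad \text{(approximate sharpness)},
\]
for some $\alpha > 0$ and $\beta \ge 1$, where $\eta \ge 0$ is the unknown perturbation of the objective error mentioned above. The "parameter-free" aspect of the scheme refers to not needing to know these constants in advance.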
arXiv Detail & Related papers (2023-01-05T19:01:41Z) - When Does Re-initialization Work? [50.70297319284022]
Re-initialization has been observed to improve generalization in recent works.
It is neither widely adopted in deep learning practice nor is it often used in state-of-the-art training protocols.
This raises the question of when re-initialization works, and whether it should be used together with regularization techniques.
arXiv Detail & Related papers (2022-06-20T21:23:15Z) - A Stochastic Proximal Method for Nonsmooth Regularized Finite Sum Optimization [7.014966911550542]
We consider the problem of training a deep neural network with nonsmooth regularization to retrieve a sparse sub-structure.
We derive a new solver, called SR2, whose convergence and worst-case complexity are established without knowledge or approximation of the gradient's Lipschitz constant.
Experiments on network instances trained on CIFAR-10 and CIFAR-100 show that SR2 consistently achieves higher sparsity and accuracy than related methods such as ProxGEN and ProxSGD.
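For context, the core step in this family of methods is a stochastic gradient step on the smooth loss followed by a proximal step on the nonsmooth regularizer. The sketch below shows that generic pattern for an $\ell_1$-regularized least-squares objective, with soft-thresholding as the proximal map; it is not the SR2 solver itself, and the fixed step size and batch size are placeholder assumptions.

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of t * ||.||_1 (soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def prox_sgd_l1(A, y, lam=0.1, step=1e-2, n_epochs=100, batch=8, seed=0):
    """Generic stochastic proximal gradient for
    (1/2m) * ||Ax - y||^2 + lam * ||x||_1."""
    rng = np.random.default_rng(seed)
    m, d = A.shape
    x = np.zeros(d)
    for _ in range(n_epochs):
        for idx in np.array_split(rng.permutation(m), max(m // batch, 1)):
            g = A[idx].T @ (A[idx] @ x - y[idx]) / len(idx)  # minibatch gradient
            x = soft_threshold(x - step * g, step * lam)     # proximal step
    return x
```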
arXiv Detail & Related papers (2022-06-14T00:28:44Z) - Iterative regularization for low complexity regularizers [18.87017835436693]
Iterative regularization exploits the implicit bias of an optimization algorithm to regularize ill-posed problems.
We propose and study the first iterative regularization procedure able to handle biases described by nonsmooth and non-strongly convex functionals.
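The mechanism these procedures (and IRKSN above) rely on is that the optimization path of a simple iterative solver passes near the regularized solution before it eventually fits the noise, so stopping early acts as the regularization. Below is a minimal sketch of that idea with plain gradient descent on noisy least squares and a discrepancy-style stopping rule; the known `noise_level` is an assumption made for illustration, and this is a generic sketch rather than the algorithm of that paper.

```python
import numpy as np

def early_stopped_gd(A, y, noise_level, max_iter=10_000):
    """Gradient descent on 0.5 * ||Ax - y||^2, stopped as soon as the residual
    reaches the (assumed known) noise level: the number of iterations plays
    the role of the regularization parameter."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2    # safe step size: 1 / ||A||_2^2
    x = np.zeros(A.shape[1])
    for t in range(max_iter):
        r = A @ x - y
        if np.linalg.norm(r) <= noise_level:  # discrepancy-style stopping rule
            break
        x = x - step * (A.T @ r)
    return x, t
```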
arXiv Detail & Related papers (2022-02-01T14:09:00Z) - WARPd: A linearly convergent first-order method for inverse problems with approximate sharpness conditions [0.0]
Sharpness conditions directly control the recovery performance of restart schemes for first-order methods.
We provide a first-order method: Weighted, Accelerated and Restarted Primal-dual (WARPd).
Under a generic approximate sharpness condition, WARPd achieves stable linear convergence to the desired vector.
We show how WARPd compares favorably with specialized state-of-the-art methods and is ideally suited for solving large-scale problems.
arXiv Detail & Related papers (2021-10-24T13:19:41Z) - The Benefits of Implicit Regularization from SGD in Least Squares Problems [116.85246178212616]
Stochastic gradient descent (SGD) exhibits strong algorithmic regularization effects in practice.
We compare the implicit regularization afforded by (unregularized) averaged SGD with the explicit regularization of ridge regression.
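One way to see the comparison concretely is to run constant-stepsize SGD on a least-squares problem, average the tail of the iterates, and measure the distance to ridge solutions across a small grid of penalties. Everything below (dimensions, stepsize, noise level, the penalty grid) is an illustrative assumption rather than the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 2000, 20
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true + 0.5 * rng.standard_normal(n)

# Constant-stepsize SGD with tail averaging (implicit regularization).
w, w_avg, step = np.zeros(d), np.zeros(d), 0.01
for t, i in enumerate(rng.integers(0, n, size=50_000)):
    w -= step * (X[i] @ w - y[i]) * X[i]       # single-sample gradient step
    if t >= 25_000:                            # average the second half of the path
        w_avg += (w - w_avg) / (t - 25_000 + 1)

# Explicit ridge regression over a small penalty grid, for comparison.
for lam in [0.1, 1.0, 10.0]:
    w_ridge = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
    print(lam, np.linalg.norm(w_avg - w_ridge))
```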
arXiv Detail & Related papers (2021-08-10T09:56:47Z) - Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
Inductive biases are central in preventing overfitting empirically.
This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression.
We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD in comparison to ordinary least squares.
arXiv Detail & Related papers (2021-03-23T17:15:53Z) - Fast OSCAR and OWL Regression via Safe Screening Rules [97.28167655721766]
Ordered Weighted $L_1$ (OWL) regularized regression is a new regression analysis for high-dimensional sparse learning.
Proximal gradient methods are used as standard approaches to solve OWL regression.
We propose the first safe screening rule for OWL regression by exploring the order of the primal solution under its unknown order structure.
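The penalty itself is simple to state: it applies a non-increasing weight vector to the sorted absolute coefficients, with OSCAR as the special case of linearly decreasing weights. The sketch below only evaluates the penalty, with illustrative weight parameters; it does not implement the proximal solver or the proposed screening rule.

```python
import numpy as np

def owl_penalty(x, weights):
    """Ordered Weighted L1 penalty: sum_i w_i * |x|_(i), where |x|_(1) >= |x|_(2) >= ...
    and the weights satisfy w_1 >= w_2 >= ... >= 0."""
    return float(np.sort(np.abs(x))[::-1] @ np.asarray(weights))

def oscar_weights(d, alpha1=1.0, alpha2=0.1):
    """OSCAR corresponds to linearly decreasing OWL weights."""
    return alpha1 + alpha2 * np.arange(d - 1, -1, -1)
```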
arXiv Detail & Related papers (2020-06-29T23:35:53Z) - Sparse Identification of Nonlinear Dynamical Systems via Reweighted $\ell_1$-regularized Least Squares [62.997667081978825]
This work proposes an iterative sparse-regularized regression method to recover governing equations of nonlinear systems from noisy state measurements.
The aim of this work is to improve the accuracy and robustness of the method in the presence of state measurement noise.
arXiv Detail & Related papers (2020-05-27T08:30:15Z)
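The reweighting loop behind the entry above can be sketched in a few lines: solve a weighted $\ell_1$-penalized least-squares problem, then increase the weights of the coefficients that came out small so they are penalized harder on the next pass. The sketch below uses a weighted ISTA inner solver and illustrative values of `lam` and `eps`; it is a generic reweighted-$\ell_1$ loop, not the specific identification procedure of that paper.

```python
import numpy as np

def reweighted_l1_least_squares(A, y, lam=0.1, n_reweight=5, n_inner=500, eps=1e-3):
    """Generic iteratively reweighted l1 least squares.

    Outer loop: update per-coefficient weights w_j = 1 / (|x_j| + eps).
    Inner loop: weighted ISTA on 0.5*||Ax - y||^2 + lam * sum_j w_j |x_j|.
    """
    m, d = A.shape
    step = 1.0 / np.linalg.norm(A, 2) ** 2        # 1 / L with L = ||A||_2^2
    x = np.zeros(d)
    w = np.ones(d)
    for _ in range(n_reweight):
        for _ in range(n_inner):                  # weighted ISTA
            z = x - step * (A.T @ (A @ x - y))
            x = np.sign(z) * np.maximum(np.abs(z) - step * lam * w, 0.0)
        w = 1.0 / (np.abs(x) + eps)               # penalize small coefficients more
    return x
```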