Preconditioned subgradient method for composite optimization: overparameterization and fast convergence
- URL: http://arxiv.org/abs/2509.11486v2
- Date: Thu, 02 Oct 2025 22:58:30 GMT
- Title: Preconditioned subgradient method for composite optimization: overparameterization and fast convergence
- Authors: Mateo Díaz, Liwei Jiang, Abdel Ghani Labassi,
- Abstract summary: Composite optimization problems involve minimizing the composition of a smooth map with a convex function. The subgradient method achieves local linear convergence when the composite loss is well-conditioned. We introduce a Levenberg-Morrison-Marquardt subgradient method that converges linearly under mild regularity conditions.
- Score: 16.437253140200788
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Composite optimization problems involve minimizing the composition of a smooth map with a convex function. Such objectives arise in numerous data science and signal processing applications, including phase retrieval, blind deconvolution, and collaborative filtering. The subgradient method achieves local linear convergence when the composite loss is well-conditioned. However, if the smooth map is, in a certain sense, ill-conditioned or overparameterized, the subgradient method exhibits much slower sublinear convergence even when the convex function is well-conditioned. To overcome this limitation, we introduce a Levenberg-Morrison-Marquardt subgradient method that converges linearly under mild regularity conditions at a rate determined solely by the convex function. Further, we demonstrate that these regularity conditions hold for several problems of practical interest, including square-variable formulations, matrix sensing, and tensor factorization. Numerical experiments illustrate the benefits of our method.
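The abstract centers on preconditioning the subgradient step for the composite problem min_x h(c(x)), with c a smooth map and h a convex function. The following is a minimal sketch rather than the paper's implementation: it assumes a damped Gauss-Newton (Levenberg-Marquardt-style) preconditioner (J^T J + lambda*I)^{-1} applied to the chain-rule subgradient J^T v, and the toy phase-retrieval-style setup (A, b, c, jac_c, subgrad_h) is purely illustrative.

```python
import numpy as np

def lmm_subgradient_step(x, c, jac_c, subgrad_h, step, damping):
    """One Levenberg-Marquardt-style preconditioned subgradient step for
    the composite problem min_x h(c(x)).

    Hypothetical sketch: the paper's exact damping and step-size rules may differ.
    """
    J = jac_c(x)                            # Jacobian of the smooth map c at x
    v = subgrad_h(c(x))                     # any subgradient of the convex function h at c(x)
    g = J.T @ v                             # chain-rule subgradient of h(c(.)) at x
    P = J.T @ J + damping * np.eye(x.size)  # damped Gauss-Newton preconditioner
    return x - step * np.linalg.solve(P, g)

# Toy usage on a robust phase-retrieval-style loss: h(z) = ||z||_1, c(x)_i = (a_i^T x)^2 - b_i.
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 5))
b = (A @ rng.standard_normal(5)) ** 2
c = lambda x: (A @ x) ** 2 - b
jac_c = lambda x: 2.0 * (A @ x)[:, None] * A
subgrad_h = np.sign                         # subgradient of the l1 norm
x = rng.standard_normal(5)
for _ in range(200):
    # Step size and damping are illustrative placeholders and would need tuning in practice.
    x = lmm_subgradient_step(x, c, jac_c, subgrad_h, step=0.5, damping=1e-6)
```

The damping term keeps the linear solve well-posed even when the Jacobian is rank-deficient, which is the overparameterized regime the abstract alludes to; the step size and damping above are placeholders rather than the paper's choices.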
Related papers
- Revisiting Convergence: Shuffling Complexity Beyond Lipschitz Smoothness [50.78508362183774]
Shuffling-type gradient methods are favored in practice for their simplicity and rapid empirical performance. Most existing analyses, however, require the Lipschitz smoothness condition, which is often not met in common machine learning settings.
arXiv Detail & Related papers (2025-07-11T15:36:48Z) - Low-Rank Extragradient Methods for Scalable Semidefinite Optimization [17.384717824118255]
We focus on high-dimensional and plausible settings in which the problem admits a low-rank solution. We provide several theoretical results proving that, under these circumstances, the well-known Extragradient method converges to a solution of the constrained optimization problem.
arXiv Detail & Related papers (2024-02-14T10:48:00Z) - Over-the-Air Computation Aided Federated Learning with the Aggregation
of Normalized Gradient [12.692064367193934]
Over-the-air computation is a communication-efficient solution for federated learning (FL).
In such a system, each mobile device iteratively computes an update from its private loss function and then transmits it over the air.
To circumvent this problem, we propose to normalize each local gradient before amplifying it.
arXiv Detail & Related papers (2023-08-17T16:15:47Z) - Convex and Non-convex Optimization Under Generalized Smoothness [69.69521650503431]
Analyses of convex and non-convex optimization methods often require Lipschitz continuity of the gradient, which limits their applicability.
Recent work generalizes this setting via a non-uniform smoothness condition.
arXiv Detail & Related papers (2023-06-02T04:21:59Z) - Linearization Algorithms for Fully Composite Optimization [61.20539085730636]
This paper studies first-order algorithms for solving fully composite optimization problems over convex compact sets.
We leverage the structure of the objective by handling the differentiable and non-differentiable parts separately, linearizing only the smooth components.
arXiv Detail & Related papers (2023-02-24T18:41:48Z) - Stochastic Gradient Methods with Preconditioned Updates [47.23741709751474]
There are several algorithms for such problems, but existing methods often work poorly when the problem is badly scaled and/or ill-conditioned.
Here we include a preconditioner based on Hutchinson's approach to approximating the diagonal of the Hessian (a minimal sketch of this estimator appears after this list).
We prove convergence both when only smoothness is assumed and when the PL condition additionally holds.
arXiv Detail & Related papers (2022-06-01T07:38:08Z) - Faster One-Sample Stochastic Conditional Gradient Method for Composite
Convex Minimization [61.26619639722804]
We propose a conditional gradient method (CGM) for minimizing convex finite-sum objectives formed as a sum of smooth and non-smooth terms.
The proposed method, equipped with a stochastic average gradient (SAG) estimator, requires only one sample per iteration, yet it guarantees fast convergence rates on par with more sophisticated variance-reduction techniques.
arXiv Detail & Related papers (2022-02-26T19:10:48Z) - On Asymptotic Linear Convergence of Projected Gradient Descent for
Constrained Least Squares [22.851500417035947]
This manuscript presents a unified framework for the analysis of projected gradient descent in the context of constrained least squares.
We present a recipe for the convergence analysis of PGD and demonstrate it via a beginning-to-end application of the recipe on four fundamental problems.
arXiv Detail & Related papers (2021-12-22T09:49:51Z) - Constrained and Composite Optimization via Adaptive Sampling Methods [3.4219044933964944]
The motivation for this paper stems from the desire to develop an adaptive sampling method for solving constrained optimization problems.
The method proposed in this paper is a proximal gradient method that can also be applied to the composite optimization problem min f(x) + h(x), where f is convex (but not necessarily differentiable).
arXiv Detail & Related papers (2020-12-31T02:50:39Z) - Random extrapolation for primal-dual coordinate descent [61.55967255151027]
We introduce a randomly extrapolated primal-dual coordinate descent method that adapts to sparsity of the data matrix and the favorable structures of the objective function.
We show almost sure convergence of the sequence and optimal sublinear convergence rates for the primal-dual gap and objective values, in the general convex-concave case.
arXiv Detail & Related papers (2020-07-13T17:39:35Z) - Conditional gradient methods for stochastically constrained convex
minimization [54.53786593679331]
We propose two novel conditional gradient-based methods for solving structured convex optimization problems.
The most important feature of our framework is that only a subset of the constraints is processed at each iteration.
Our algorithms rely on variance reduction and smoothing used in conjunction with conditional gradient steps, and are accompanied by rigorous convergence guarantees.
arXiv Detail & Related papers (2020-07-07T21:26:35Z)
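The "Stochastic Gradient Methods with Preconditioned Updates" entry above refers to a preconditioner built from Hutchinson's diagonal-Hessian estimator, diag(H) ≈ E[z ⊙ (Hz)] for random sign vectors z. Below is a minimal sketch of that estimator and one plausible way to use it as a diagonal preconditioner; the function names, the inverse-absolute-diagonal scaling, and the Hessian-vector-product oracle `hvp` are assumptions for illustration, not the paper's exact update.

```python
import numpy as np

def hutchinson_diag_hessian(hvp, dim, num_samples=10, rng=None):
    """Estimate diag(H) via Hutchinson's identity diag(H) = E[z * (H @ z)]
    for Rademacher vectors z; hvp(z) must return the Hessian-vector product H @ z."""
    rng = np.random.default_rng() if rng is None else rng
    estimate = np.zeros(dim)
    for _ in range(num_samples):
        z = rng.choice([-1.0, 1.0], size=dim)  # Rademacher probe vector
        estimate += z * hvp(z)
    return estimate / num_samples

def preconditioned_sgd_step(x, grad, hvp, lr=0.1, eps=1e-6):
    """Illustrative preconditioned step: scale the stochastic gradient by the
    inverse of the (absolute) estimated Hessian diagonal, floored at eps."""
    diag = np.abs(hutchinson_diag_hessian(hvp, x.size)) + eps
    return x - lr * grad / diag
```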