Related papers: FastPart: Over-Parameterized Stochastic Gradient Descent for Sparse optimisation on Measures

FastPart: Over-Parameterized Stochastic Gradient Descent for Sparse optimisation on Measures

URL: http://arxiv.org/abs/2312.05993v1
Date: Sun, 10 Dec 2023 20:41:43 GMT
Title: FastPart: Over-Parameterized Stochastic Gradient Descent for Sparse optimisation on Measures
Authors: Yohann De Castro, S\'ebastien Gadat, Cl\'ement Marteau
Abstract summary: This paper presents a novel algorithm that leverages Gradient Descent strategies in conjunction with Random Features to augment the scalability of Conic Particle Gradient Descent (CPGD) We provide rigorous proofs demonstrating the following key findings: (i) The total variation norms of the solution measures along the descent trajectory remain bounded, ensuring stability and preventing undesirable divergence; (ii) We establish a global convergence guarantee with a convergence rate of $mathcalO(log(K)/sqrtK)$ over $K$, showcasing the efficiency and effectiveness of our algorithm; (iii) Additionally, we analyze and establish
Score: 1.9950682531209156
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: This paper presents a novel algorithm that leverages Stochastic Gradient Descent strategies in conjunction with Random Features to augment the scalability of Conic Particle Gradient Descent (CPGD) specifically tailored for solving sparse optimisation problems on measures. By formulating the CPGD steps within a variational framework, we provide rigorous mathematical proofs demonstrating the following key findings: (i) The total variation norms of the solution measures along the descent trajectory remain bounded, ensuring stability and preventing undesirable divergence; (ii) We establish a global convergence guarantee with a convergence rate of $\mathcal{O}(\log(K)/\sqrt{K})$ over $K$ iterations, showcasing the efficiency and effectiveness of our algorithm; (iii) Additionally, we analyze and establish local control over the first-order condition discrepancy, contributing to a deeper understanding of the algorithm's behavior and reliability in practical applications.

Related papers

The Distributionally Robust Optimization Model of Sparse Principal Component Analysis [7.695578200868269]
We consider sparse principal component analysis (PCA) under a setting where the underlying probability distribution of the random parameter is uncertain. This problem is formulated as a distributionally robust optimization (DRO) model based on a constructive approach to capturing uncertainty. We prove that the inner problem admits a closed-form solution, reformulating the original DRO model into an equivalent minimization problem on the Stiefel manifold.
arXiv Detail & Related papers (2025-03-04T11:00:08Z)
Trust-Region Sequential Quadratic Programming for Stochastic Optimization with Random Models [57.52124921268249]
We propose a Trust Sequential Quadratic Programming method to find both first and second-order stationary points. To converge to first-order stationary points, our method computes a gradient step in each iteration defined by minimizing a approximation of the objective subject. To converge to second-order stationary points, our method additionally computes an eigen step to explore the negative curvature the reduced Hessian matrix.
arXiv Detail & Related papers (2024-09-24T04:39:47Z)
A Unified Theory of Stochastic Proximal Point Methods without Smoothness [52.30944052987393]
Proximal point methods have attracted considerable interest owing to their numerical stability and robustness against imperfect tuning. This paper presents a comprehensive analysis of a broad range of variations of the proximal point method (SPPM)
arXiv Detail & Related papers (2024-05-24T21:09:19Z)
An Accelerated Block Proximal Framework with Adaptive Momentum for Nonconvex and Nonsmooth Optimization [2.323238724742687]
We propose an accelerated block proximal linear framework with adaptive momentum (ABPL$+$) for nonsmooth and nonsmooth optimization. We analyze the potential causes of the extrapolation step failing in some algorithms, and resolve this issue by enhancing the comparison process. We extend our algorithm to any scenario involving updating the gradient step and the linear extrapolation step.
arXiv Detail & Related papers (2023-08-23T13:32:31Z)
Fully Stochastic Trust-Region Sequential Quadratic Programming for Equality-Constrained Optimization Problems [62.83783246648714]
We propose a sequential quadratic programming algorithm (TR-StoSQP) to solve nonlinear optimization problems with objectives and deterministic equality constraints. The algorithm adaptively selects the trust-region radius and, compared to the existing line-search StoSQP schemes, allows us to utilize indefinite Hessian matrices.
arXiv Detail & Related papers (2022-11-29T05:52:17Z)
Stochastic Mirror Descent for Large-Scale Sparse Recovery [13.500750042707407]
We discuss an application of quadratic Approximation to statistical estimation of high-dimensional sparse parameters. We show that the proposed algorithm attains the optimal convergence of the estimation error under weak assumptions on the regressor distribution.
arXiv Detail & Related papers (2022-10-23T23:23:23Z)
Utilising the CLT Structure in Stochastic Gradient based Sampling : Improved Analysis and Faster Algorithms [14.174806471635403]
We consider approximations of sampling algorithms, such as Gradient Langevin Dynamics (SGLD) and the Random Batch Method (RBM) for Interacting Particle Dynamcs (IPD) We observe that the noise introduced by the approximation is nearly Gaussian due to the Central Limit Theorem (CLT) while the driving Brownian motion is exactly Gaussian. We harness this structure to absorb the approximation error inside the diffusion process, and obtain improved convergence guarantees for these algorithms.
arXiv Detail & Related papers (2022-06-08T10:17:40Z)
A Semismooth Newton Stochastic Proximal Point Algorithm with Variance Reduction [2.048226951354646]
We develop an implementable proximal point (SPP) method for a class of weakly convex, composite optimization problems. The proposed algorithm incorporates a variance reduction mechanism and the resulting updates are solved using an inexact semismooth Newton framework.
arXiv Detail & Related papers (2022-04-01T13:08:49Z)
Breaking the Convergence Barrier: Optimization via Fixed-Time Convergent Flows [4.817429789586127]
We introduce a Poly-based optimization framework for achieving acceleration, based on the notion of fixed-time stability dynamical systems. We validate the accelerated convergence properties of the proposed schemes on a range of numerical examples against the state-of-the-art optimization algorithms.
arXiv Detail & Related papers (2021-12-02T16:04:40Z)
Momentum Accelerates the Convergence of Stochastic AUPRC Maximization [80.8226518642952]
We study optimization of areas under precision-recall curves (AUPRC), which is widely used for imbalanced tasks. We develop novel momentum methods with a better iteration of $O (1/epsilon4)$ for finding an $epsilon$stationary solution. We also design a novel family of adaptive methods with the same complexity of $O (1/epsilon4)$, which enjoy faster convergence in practice.
arXiv Detail & Related papers (2021-07-02T16:21:52Z)
Stochastic Gradient Descent-Ascent and Consensus Optimization for Smooth Games: Convergence Analysis under Expected Co-coercivity [49.66890309455787]
We introduce the expected co-coercivity condition, explain its benefits, and provide the first last-iterate convergence guarantees of SGDA and SCO. We prove linear convergence of both methods to a neighborhood of the solution when they use constant step-size. Our convergence guarantees hold under the arbitrary sampling paradigm, and we give insights into the complexity of minibatching.
arXiv Detail & Related papers (2021-06-30T18:32:46Z)
On the Convergence of Stochastic Extragradient for Bilinear Games with Restarted Iteration Averaging [96.13485146617322]
We present an analysis of the ExtraGradient (SEG) method with constant step size, and present variations of the method that yield favorable convergence. We prove that when augmented with averaging, SEG provably converges to the Nash equilibrium, and such a rate is provably accelerated by incorporating a scheduled restarting procedure.
arXiv Detail & Related papers (2021-06-30T17:51:36Z)

This list is automatically generated from the titles and abstracts of the papers in this site.