Related papers: Accelerated First-Order Optimization under Nonlinear Constraints

Accelerated First-Order Optimization under Nonlinear Constraints

URL: http://arxiv.org/abs/2302.00316v2
Date: Tue, 2 Jan 2024 09:50:04 GMT
Title: Accelerated First-Order Optimization under Nonlinear Constraints
Authors: Michael Muehlebach and Michael I. Jordan
Abstract summary: We exploit between first-order algorithms for constrained optimization and non-smooth systems to design a new class of accelerated first-order algorithms. An important property of these algorithms is that constraints are expressed in terms of velocities instead of sparse variables.
Score: 73.2273449996098
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We exploit analogies between first-order algorithms for constrained optimization and non-smooth dynamical systems to design a new class of accelerated first-order algorithms for constrained optimization. Unlike Frank-Wolfe or projected gradients, these algorithms avoid optimization over the entire feasible set at each iteration. We prove convergence to stationary points even in a nonconvex setting and we derive accelerated rates for the convex setting both in continuous time, as well as in discrete time. An important property of these algorithms is that constraints are expressed in terms of velocities instead of positions, which naturally leads to sparse, local and convex approximations of the feasible set (even if the feasible set is nonconvex). Thus, the complexity tends to grow mildly in the number of decision variables and in the number of constraints, which makes the algorithms suitable for machine learning applications. We apply our algorithms to a compressed sensing and a sparse regression problem, showing that we can treat nonconvex $\ell^p$ constraints ($p<1$) efficiently, while recovering state-of-the-art performance for $p=1$.

Related papers

Stochastic Smoothed Primal-Dual Algorithms for Nonconvex Optimization with Linear Inequality Constraints [12.624604051853657]
We propose smoothed primal-dual algorithms for solving nonexact optimization problems with linear inequality constraints. Our algorithms are single-loop iterations based on one gradient at each sample. Unlike existing methods, our algorithms are free sub, large sizes or increasing parameters and use dual variable updates to ensure feasibility.
arXiv Detail & Related papers (2025-04-10T09:59:43Z)
A Penalty-Based Guardrail Algorithm for Non-Decreasing Optimization with Inequality Constraints [1.5498250598583487]
Traditional mathematical programming solvers require long computational times to solve constrained minimization problems. We propose a penalty-based guardrail algorithm (PGA) to efficiently solve them.
arXiv Detail & Related papers (2024-05-03T10:37:34Z)
Accelerating Cutting-Plane Algorithms via Reinforcement Learning Surrogates [49.84541884653309]
A current standard approach to solving convex discrete optimization problems is the use of cutting-plane algorithms. Despite the existence of a number of general-purpose cut-generating algorithms, large-scale discrete optimization problems continue to suffer from intractability. We propose a method for accelerating cutting-plane algorithms via reinforcement learning.
arXiv Detail & Related papers (2023-07-17T20:11:56Z)
Infeasible Deterministic, Stochastic, and Variance-Reduction Algorithms for Optimization under Orthogonality Constraints [9.301728976515255]
This article provides new practical and theoretical developments for the landing algorithm. First, the method is extended to the Stiefel manifold. We also consider variance reduction algorithms when the cost function is an average of many functions.
arXiv Detail & Related papers (2023-03-29T07:36:54Z)
Convergence Rates of Two-Time-Scale Gradient Descent-Ascent Dynamics for Solving Nonconvex Min-Max Problems [2.0305676256390934]
We characterize the finite-time performance of the continuous-time variant of simultaneous gradient descent-ascent algorithm. Our results on the behavior of continuous-time algorithm may be used to enhance the convergence properties of its discrete-time counterpart.
arXiv Detail & Related papers (2021-12-17T15:51:04Z)
On Constraints in First-Order Optimization: A View from Non-Smooth Dynamical Systems [99.59934203759754]
We introduce a class of first-order methods for smooth constrained optimization. Two distinctive features of our approach are that projections or optimizations over the entire feasible set are avoided. The resulting algorithmic procedure is simple to implement even when constraints are nonlinear.
arXiv Detail & Related papers (2021-07-17T11:45:13Z)
Zeroth and First Order Stochastic Frank-Wolfe Algorithms for Constrained Optimization [13.170519806372075]
Problems of convex optimization with two sets of constraints arise frequently in the context of semidefinite programming. Since projection onto the first set of constraints is difficult, it becomes necessary to explore projection-free algorithms. The efficacy of the proposed algorithms is tested on relevant applications of sparse matrix estimation, clustering via semidefinite relaxation, and uniform sparsest cut problem.
arXiv Detail & Related papers (2021-07-14T08:01:30Z)
Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth Nonlinear TD Learning [145.54544979467872]
We propose two single-timescale single-loop algorithms that require only one data point each step. Our results are expressed in a form of simultaneous primal and dual side convergence.
arXiv Detail & Related papers (2020-08-23T20:36:49Z)
Convergence of adaptive algorithms for weakly convex constrained optimization [59.36386973876765]
We prove the $mathcaltilde O(t-1/4)$ rate of convergence for the norm of the gradient of Moreau envelope. Our analysis works with mini-batch size of $1$, constant first and second order moment parameters, and possibly smooth optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z)
Private Stochastic Convex Optimization: Optimal Rates in Linear Time [74.47681868973598]
We study the problem of minimizing the population loss given i.i.d. samples from a distribution over convex loss functions. A recent work of Bassily et al. has established the optimal bound on the excess population loss achievable given $n$ samples. We describe two new techniques for deriving convex optimization algorithms both achieving the optimal bound on excess loss and using $O(minn, n2/d)$ gradient computations.
arXiv Detail & Related papers (2020-05-10T19:52:03Z)

This list is automatically generated from the titles and abstracts of the papers in this site.