Analyzing and Improving Greedy 2-Coordinate Updates for
Equality-Constrained Optimization via Steepest Descent in the 1-Norm
- URL: http://arxiv.org/abs/2307.01169v1
- Date: Mon, 3 Jul 2023 17:27:18 GMT
- Title: Analyzing and Improving Greedy 2-Coordinate Updates for
Equality-Constrained Optimization via Steepest Descent in the 1-Norm
- Authors: Amrutha Varshini Ramesh, Aaron Mishkin, Mark Schmidt, Yihan Zhou,
Jonathan Wilder Lavington, Jennifer She
- Abstract summary: We exploit a connection between the greedy 2-coordinate update for this problem and equality-constrained steepest descent in the 1-norm.
We then consider minimizing with both a summation constraint and bound constraints, as arises in the support vector machine dual problem.
- Score: 12.579063422860072
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider minimizing a smooth function subject to a summation constraint
over its variables. By exploiting a connection between the greedy 2-coordinate
update for this problem and equality-constrained steepest descent in the
1-norm, we give a convergence rate for greedy selection under a proximal
Polyak-Lojasiewicz assumption that is faster than random selection and
independent of the problem dimension $n$. We then consider minimizing with both
a summation constraint and bound constraints, as arises in the support vector
machine dual problem. Existing greedy rules for this setting either guarantee
trivial progress only or require $O(n^2)$ time to compute. We show that bound-
and summation-constrained steepest descent in the L1-norm guarantees more
progress per iteration than previous rules and can be computed in only $O(n
\log n)$ time.
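The greedy 2-coordinate update the abstract refers to can be sketched as follows. This is an illustrative reading, not the authors' implementation: the helper names (`greedy_two_coordinate_step`, `grad_f`) are assumptions, and the step uses the exact line-search step size for a quadratic objective. Moving mass between the coordinates with the largest and smallest partial derivatives preserves the summation constraint and coincides with equality-constrained steepest descent in the 1-norm.

```python
import numpy as np

def greedy_two_coordinate_step(grad_f, x, step_size=1.0):
    """One greedy 2-coordinate update that preserves sum(x).

    Picks the coordinates with the largest and smallest partial
    derivatives and shifts mass from one to the other, so the
    summation constraint sum(x) = const is maintained exactly.
    """
    g = grad_f(x)
    i = int(np.argmax(g))  # largest partial derivative: decrease x_i
    j = int(np.argmin(g))  # smallest partial derivative: increase x_j
    x_new = x.copy()
    # Exact minimizing step along e_j - e_i for a quadratic with
    # identity Hessian (an assumption for this sketch).
    delta = step_size * (g[i] - g[j]) / 2.0
    x_new[i] -= delta
    x_new[j] += delta
    return x_new

# Toy example: minimize f(x) = 0.5 * ||x - c||^2 subject to sum(x) = 0.
c = np.array([1.0, 2.0, 3.0, 4.0])
grad = lambda x: x - c
x = np.zeros(4)
for _ in range(200):
    x = greedy_two_coordinate_step(grad, x)
# sum(x) stays at 0 throughout; x converges to the projection of c
# onto the hyperplane {x : sum(x) = 0}.
```

At a constrained minimum all partial derivatives are equal (the gradient is a multiple of the all-ones vector), at which point `argmax` and `argmin` coincide and the update becomes a no-op.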
Related papers
- Obtaining Lower Query Complexities through Lightweight Zeroth-Order Proximal Gradient Algorithms [65.42376001308064]
We propose two variance-reduced zeroth-order (ZO) gradient estimators for complex gradient problems.
We improve the state-of-the-art function query complexities from $\mathcal{O}\left(\min\left(\frac{dn^{1/2}}{\epsilon^2}, \frac{d}{\epsilon^3}\right)\right)$ to $\tilde{\mathcal{O}}\left(\frac{d}{\epsilon^2}\right)$.
arXiv Detail & Related papers (2024-10-03T15:04:01Z) - First-Order Methods for Linearly Constrained Bilevel Optimization [38.19659447295665]
We present first-order methods for linearly constrained bilevel optimization that avoid expensive Hessian computations.
For linear inequality constraints, we attain $(\delta,\epsilon)$-Goldstein stationarity in $\widetilde{O}(d\delta^{-1}\epsilon^{-3})$ gradient oracle calls.
arXiv Detail & Related papers (2024-06-18T16:41:21Z) - Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic
Shortest Path [80.60592344361073]
We study the Stochastic Shortest Path (SSP) problem with a linear mixture transition kernel.
An agent repeatedly interacts with an environment and seeks to reach a certain goal state while minimizing the cumulative cost.
Existing works often assume a strictly positive lower bound of the iteration cost function or an upper bound of the expected length for the optimal policy.
arXiv Detail & Related papers (2024-02-14T07:52:00Z) - Strictly Low Rank Constraint Optimization -- An Asymptotically
$\mathcal{O}(\frac{1}{t^2})$ Method [5.770309971945476]
We study a class of nonconvex and nonsmooth problems with rank regularization to promote sparsity in the optimal solution.
We show that our algorithms achieve a convergence rate of $O(\frac{1}{t^2})$, which matches Nesterov's optimal convergence rate for first-order methods on smooth convex problems.
arXiv Detail & Related papers (2023-07-04T16:55:41Z) - An efficient algorithm for the $\ell_{p}$ norm based metric nearness
problem [4.700135257490953]
We propose a delayed constraint generation method with each subproblem solved by the semismooth Newton based proximal augmented Lagrangian method (PALM) for the metric nearness problem.
A pleasing aspect of our algorithm is that we can solve these problems involving up to $10^8$ variables and $10^{13}$ constraints.
arXiv Detail & Related papers (2022-11-02T16:29:05Z) - Finite-Time Error Bounds for Greedy-GQ [20.51105692499517]
We establish finite-time error bounds showing that the Greedy-GQ algorithm converges fast.
Our analysis provides guidance on choosing step-sizes for faster convergence.
arXiv Detail & Related papers (2022-09-06T15:04:57Z) - Best Policy Identification in Linear MDPs [70.57916977441262]
We investigate the problem of best policy identification in discounted linear Markov Decision Processes in the fixed-confidence setting under a generative model.
The lower bound, obtained as the solution of an intricate non-convex optimization program, can be used as the starting point to devise such algorithms.
arXiv Detail & Related papers (2022-08-11T04:12:50Z) - Sharper Convergence Guarantees for Asynchronous SGD for Distributed and
Federated Learning [77.22019100456595]
We analyze a training algorithm for distributed computation workers with varying communication frequency.
In this work, we obtain a tighter convergence rate that depends on the gradient noise $\sigma^2$ and the average delay $\tau_{avg}$.
We also show that the heterogeneity term in the rate is affected by the average delay within each worker.
arXiv Detail & Related papers (2022-06-16T17:10:57Z) - The Complexity of Nonconvex-Strongly-Concave Minimax Optimization [43.07732143522183]
This paper establishes the complexity of finding approximate stationary points of nonconvex-strongly-concave (NC-SC) smooth minimax problems.
We construct a sequence of strongly-concave subproblems to establish lower bounds in both the general and averaged-smoothness settings.
In the finite-sum setting, our proposed algorithm achieves a nearly tight dependence on the condition number.
arXiv Detail & Related papers (2021-03-29T18:53:57Z) - A Momentum-Assisted Single-Timescale Stochastic Approximation Algorithm
for Bilevel Optimization [112.59170319105971]
We propose a new algorithm -- the Momentum-assisted Single-Timescale Stochastic Approximation (MSTSA) -- for tackling bilevel optimization problems.
MSTSA allows us to control the error in the iterations caused by inexact solutions to the lower-level subproblem.
arXiv Detail & Related papers (2021-02-15T07:10:33Z) - Dynamic Regret of Convex and Smooth Functions [93.71361250701075]
We investigate online convex optimization in non-stationary environments.
We choose the dynamic regret as the performance measure.
We show that it is possible to further enhance the dynamic regret by exploiting the smoothness condition.
arXiv Detail & Related papers (2020-07-07T14:10:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.