Zero-Order One-Point Estimate with Distributed Stochastic
Gradient-Tracking Technique
- URL: http://arxiv.org/abs/2210.05618v1
- Date: Tue, 11 Oct 2022 17:04:45 GMT
- Title: Zero-Order One-Point Estimate with Distributed Stochastic
Gradient-Tracking Technique
- Authors: Elissa Mhanna and Mohamad Assaad
- Abstract summary: We consider a distributed multi-agent optimization problem where each agent holds a local objective function that is smooth and convex.
We extend the distributed gradient-tracking method to the bandit setting where we don't have an estimate of the gradient.
We analyze the convergence of this novel technique for smooth and convex objectives using stochastic approximation tools.
- Score: 23.63073074337495
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we consider a distributed multi-agent stochastic optimization
problem, where each agent holds a local objective function that is smooth and
convex, and that is subject to a stochastic process. The goal is for all agents
to collaborate to find a common solution that optimizes the sum of these local
functions. With the practical assumption that agents can only obtain noisy
numerical function queries at exactly one point at a time, we extend the
distributed stochastic gradient-tracking method to the bandit setting where we
don't have an estimate of the gradient, and we introduce a zero-order (ZO)
one-point estimate (1P-DSGT). We analyze the convergence of this novel
technique for smooth and convex objectives using stochastic approximation
tools, and we prove that it converges almost surely to the optimum. We then
study the convergence rate when the objectives are additionally strongly
convex. We obtain a rate of $O(\frac{1}{\sqrt{k}})$ after a sufficient number
of iterations $k > K_2$ which is usually optimal for techniques utilizing
one-point estimators. We also provide a regret bound of $O(\sqrt{k})$, which is
exceptionally good compared to the aforementioned techniques. We further
illustrate the usefulness of the proposed technique using numerical
experiments.
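For intuition about the setting, the sketch below shows a generic one-point zero-order gradient estimate: each query returns only a (possibly noisy) function value at a single perturbed point. The function names, the Rademacher perturbation, and the scaling are illustrative assumptions, not the authors' exact 1P-DSGT construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def one_point_grad_estimate(f, x, gamma=0.1):
    """Generic one-point zero-order gradient estimate (illustrative):
    query f once at a randomly perturbed point and scale the random
    direction by the observed function value."""
    d = x.size
    u = rng.choice([-1.0, 1.0], size=d)  # Rademacher perturbation (one common choice)
    return (d / gamma) * f(x + gamma * u) * u

# Toy smooth convex objective: f(x) = 0.5 * ||x||^2
f = lambda x: 0.5 * np.dot(x, x)
x = np.array([1.0, -2.0])
g = one_point_grad_estimate(f, x)  # a single noisy direction estimate
```

Such one-point estimates are high-variance, which is why rates like the paper's $O(\frac{1}{\sqrt{k}})$ after sufficiently many iterations are typical for this query model; in the gradient-tracking scheme each agent would form such an estimate from its own local objective and mix it with its neighbors' tracking variables.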
Related papers
- Adaptive Variance Reduction for Stochastic Optimization under Weaker Assumptions [26.543628010637036]
We introduce a novel adaptive variance-reduction method that achieves an optimal convergence rate of $\mathcal{O}(\log T)$ for non-convex functions.
We also extend the proposed technique to obtain the same optimal rate of $\mathcal{O}(\log T)$ for compositional optimization.
arXiv Detail & Related papers (2024-06-04T04:39:51Z) - CEDAS: A Compressed Decentralized Stochastic Gradient Method with
Improved Convergence [10.770843226843418]
In this paper, we consider solving the distributed optimization problem of a multi-agent network under the communication restricted setting.
We propose a method termed compressed exact diffusion with adaptive stepsizes (CEDAS), and show that it achieves convergence guarantees for both smooth convex and smooth non-convex objectives.
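To illustrate the communication-restricted setting, here is a top-$k$ sparsifier, a standard compression operator in decentralized optimization (an illustrative example only, not necessarily the compressor CEDAS uses):

```python
import numpy as np

def topk_compress(v, k):
    """Keep only the k largest-magnitude entries of v and zero the rest,
    so each communication round sends k values instead of len(v)."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]  # indices of the k largest magnitudes
    out[idx] = v[idx]
    return out

grad = np.array([3.0, -1.0, 0.5, 2.0])
cg = topk_compress(grad, 2)  # -> [3., 0., 0., 2.]
```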
arXiv Detail & Related papers (2023-01-14T09:49:15Z) - Explicit Second-Order Min-Max Optimization Methods with Optimal Convergence Guarantee [86.05440220344755]
We propose and analyze inexact regularized Newton-type methods for finding a global saddle point of convex-concave unconstrained min-max optimization problems.
We show that the proposed methods generate iterates that remain within a bounded set and converge to an $\epsilon$-saddle point within $O(\epsilon^{-2/3})$ iterations in terms of a restricted gap function.
arXiv Detail & Related papers (2022-10-23T21:24:37Z) - Stochastic Zeroth order Descent with Structured Directions [14.338969001937068]
We introduce and analyze Structured Zeroth order Descent (S-SZD), a finite-difference approach which approximates a gradient along a set of $l \leq d$ directions, where $d$ is the dimension of the ambient space.
For convex functions we prove almost sure convergence and a convergence rate on the function values of order $O(k^{-c})$ for every $c < 1/2$, which is arbitrarily close to the one of Stochastic Gradient Descent (SGD) in terms of number of iterations.
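The idea of building a gradient surrogate from $l \leq d$ directions can be sketched with forward finite differences (an illustrative construction under the assumption of orthonormal directions; S-SZD's exact update may differ):

```python
import numpy as np

def structured_zo_grad(f, x, P, h=1e-6):
    """Combine forward-difference directional derivatives of f along the
    l <= d columns of P (assumed orthonormal) into a gradient surrogate."""
    g = np.zeros_like(x)
    for p in P.T:                              # each column is one direction
        g += (f(x + h * p) - f(x)) / h * p     # directional derivative times direction
    return g

f = lambda x: np.dot(x, x)            # true gradient is 2x
x = np.array([1.0, 2.0, 3.0])
P = np.eye(3)[:, :2]                  # l = 2 coordinate directions
g = structured_zo_grad(f, x, P)       # close to [2., 4., 0.]
```

Only the components of the gradient spanned by the chosen directions are recovered, which is the price paid for using $l < d$ queries per step.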
arXiv Detail & Related papers (2022-06-10T14:00:06Z) - Sharper Rates and Flexible Framework for Nonconvex SGD with Client and
Data Sampling [64.31011847952006]
We revisit the problem of finding an approximately stationary point of the average of $n$ smooth and possibly non-convex functions.
We generalize the method so that it can provably work with virtually any sampling mechanism.
We provide the most general and most accurate analysis of the optimal bounds in the smooth non-convex regime.
arXiv Detail & Related papers (2022-06-05T21:32:33Z) - STORM+: Fully Adaptive SGD with Momentum for Nonconvex Optimization [74.1615979057429]
We investigate stochastic non-convex optimization problems where the objective is an expectation over smooth loss functions.
Our work builds on the STORM algorithm, in conjunction with a novel approach to adaptively set the learning rate and momentum parameters.
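The STORM recursion referenced here combines a fresh stochastic gradient with a momentum-corrected previous direction; its standard form (with momentum parameter $a \in (0,1]$ and both gradients evaluated on the same sample $\xi_t$) is:

```python
import numpy as np

def storm_direction(grad_xt, grad_xprev_same_sample, d_prev, a):
    """Standard STORM variance-reduced direction:
    d_t = grad f(x_t; xi_t) + (1 - a) * (d_{t-1} - grad f(x_{t-1}; xi_t)).
    With a = 1 it reduces to plain SGD; smaller a reuses more history."""
    return grad_xt + (1.0 - a) * (d_prev - grad_xprev_same_sample)

g_t = np.array([1.0, 2.0])        # gradient at x_t on sample xi_t
g_prev = np.array([0.5, 1.0])     # gradient at x_{t-1} on the SAME sample
d_prev = np.array([2.0, 2.0])     # previous direction
d_t = storm_direction(g_t, g_prev, d_prev, 0.5)  # -> [1.75, 2.5]
```

Per the abstract, STORM+ additionally sets the learning rate and the momentum parameter $a$ adaptively, rather than requiring problem-dependent tuning.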
arXiv Detail & Related papers (2021-11-01T15:43:36Z) - Momentum Accelerates the Convergence of Stochastic AUPRC Maximization [80.8226518642952]
We study optimization of areas under precision-recall curves (AUPRC), which is widely used for imbalanced tasks.
We develop novel momentum methods with a better iteration complexity of $O(1/\epsilon^4)$ for finding an $\epsilon$-stationary solution.
We also design a novel family of adaptive methods with the same complexity of $O(1/\epsilon^4)$, which enjoy faster convergence in practice.
arXiv Detail & Related papers (2021-07-02T16:21:52Z) - Convergence Analysis of Nonconvex Distributed Stochastic Zeroth-order
Coordinate Method [3.860616339202303]
This paper investigates the distributed non-convex optimization problem of minimizing a global cost function formed by the summation of $n$ local cost functions.
Each agent applies a zeroth-order (ZO) coordinate method, using only local function-value queries, to solve the problem.
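A coordinate-wise zeroth-order estimate queries the objective along each coordinate separately; a standard two-point central-difference version (illustrative, not necessarily the paper's exact scheme) looks like:

```python
import numpy as np

def zo_coordinate_grad(f, x, h=1e-5):
    """Estimate each partial derivative of f with a central difference,
    using only function evaluations (2 * dim queries per estimate)."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = 1.0
        g[i] = (f(x + h * e) - f(x - h * e)) / (2.0 * h)
    return g

f = lambda x: np.sum(x ** 2)                       # true gradient is 2x
g = zo_coordinate_grad(f, np.array([1.0, -2.0]))   # close to [2., -4.]
```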
arXiv Detail & Related papers (2021-03-24T03:07:46Z) - Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box
Optimization Framework [100.36569795440889]
This work concerns zeroth-order (ZO) optimization, which does not require first-order gradient information.
We show that with a graceful design in coordinate importance sampling, the proposed ZO optimization method is efficient both in terms of iteration complexity and function query cost.
arXiv Detail & Related papers (2020-12-21T17:29:58Z) - Accelerate the Warm-up Stage in the Lasso Computation via a Homotopic
Approach [2.538209532048867]
A homotopic method is used to approximate the $\ell_1$ penalty that is used in Lasso-type estimators.
We prove a faster numerical convergence rate than any existing method for computing Lasso-type estimators.
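The homotopy idea can be illustrated with a classic smooth surrogate of the absolute value, $\sqrt{t^2 + \epsilon^2} \to |t|$ as $\epsilon \to 0$ (a generic smoothing chosen for illustration, not necessarily the paper's exact approximation):

```python
import numpy as np

def smoothed_l1(beta, eps):
    """Smooth approximation of ||beta||_1: differentiable everywhere for
    eps > 0, converging to the exact l1 penalty as eps shrinks."""
    return float(np.sum(np.sqrt(beta ** 2 + eps ** 2)))

beta = np.array([0.5, -1.5, 0.0])
# Following the homotopy path: decreasing eps drives the surrogate
# toward the true l1 value ||beta||_1 = 2.0.
vals = [smoothed_l1(beta, eps) for eps in (1.0, 0.1, 1e-6)]
```

Solving a sequence of such smoothed problems while shrinking $\epsilon$ is what lets the warm-up stage avoid the non-differentiability of the exact penalty.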
arXiv Detail & Related papers (2020-10-26T22:37:49Z) - Convergence of adaptive algorithms for weakly convex constrained
optimization [59.36386973876765]
We prove the $\tilde{\mathcal{O}}(t^{-1/4})$ rate of convergence for the norm of the gradient of the Moreau envelope.
Our analysis works with mini-batch size of $1$, constant first and second order moment parameters, and possibly unbounded optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.