Zero-Order One-Point Estimate with Distributed Stochastic
Gradient-Tracking Technique
- URL: http://arxiv.org/abs/2210.05618v1
- Date: Tue, 11 Oct 2022 17:04:45 GMT
- Title: Zero-Order One-Point Estimate with Distributed Stochastic
Gradient-Tracking Technique
- Authors: Elissa Mhanna and Mohamad Assaad
- Abstract summary: We consider a distributed multi-agent optimization problem where each agent holds a local objective function that is smooth and convex.
We extend the distributed gradient-tracking method to the bandit setting where we don't have an estimate of the gradient.
We analyze the convergence of this novel technique for smooth and convex objectives using stochastic approximation tools.
- Score: 23.63073074337495
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we consider a distributed multi-agent stochastic optimization
problem, where each agent holds a local objective function that is smooth and
convex, and that is subject to a stochastic process. The goal is for all agents
to collaborate to find a common solution that optimizes the sum of these local
functions. With the practical assumption that agents can only obtain noisy
numerical function queries at exactly one point at a time, we extend the
distributed stochastic gradient-tracking method to the bandit setting where we
don't have an estimate of the gradient, and we introduce a zero-order (ZO)
one-point estimate (1P-DSGT). We analyze the convergence of this novel
technique for smooth and convex objectives using stochastic approximation
tools, and we prove that it converges almost surely to the optimum. We then
study the convergence rate for when the objectives are additionally strongly
convex. We obtain a rate of $O(\frac{1}{\sqrt{k}})$ after a sufficient number
of iterations $k > K_2$ which is usually optimal for techniques utilizing
one-point estimators. We also provide a regret bound of $O(\sqrt{k})$, which is
exceptionally good compared to the aforementioned techniques. We further
illustrate the usefulness of the proposed technique using numerical
experiments.
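
The abstract does not spell out the update equations, so the following is a minimal, hedged sketch of how a zero-order one-point gradient estimate can be combined with a distributed stochastic gradient-tracking update in the spirit of 1P-DSGT. The ring topology, mixing matrix W, quadratic local objectives, noise level, projection radius, and the step-size and smoothing-radius schedules (gamma_k, beta_k) are illustrative assumptions, not the authors' exact choices.

```python
# Minimal sketch (illustrative assumptions, not the authors' code): each agent
# forms a gradient surrogate from a single noisy function query per iteration
# and plugs it into a gradient-tracking update over a doubly stochastic matrix W.
import numpy as np

rng = np.random.default_rng(0)
n_agents, dim = 5, 10
R = 10.0  # radius of the ball we project onto (assumption to keep queries bounded)

# Hypothetical smooth convex local objectives f_i(x) = 0.5 * ||A_i x - b_i||^2,
# observed only through noisy numerical queries.
A = [rng.standard_normal((dim, dim)) / np.sqrt(dim) for _ in range(n_agents)]
b = [rng.standard_normal(dim) for _ in range(n_agents)]

def noisy_query(i, x):
    """One noisy numerical value of agent i's local objective (the only oracle used)."""
    return 0.5 * np.sum((A[i] @ x - b[i]) ** 2) + 0.01 * rng.standard_normal()

def one_point_estimate(i, x_i, beta):
    """Zero-order one-point gradient estimate built from a single noisy query."""
    u = rng.standard_normal(dim)
    u /= np.linalg.norm(u)                     # random direction on the unit sphere
    return (dim / beta) * noisy_query(i, x_i + beta * u) * u

def project(z):
    """Projection onto the ball of radius R (illustrative safeguard)."""
    nrm = np.linalg.norm(z)
    return z if nrm <= R else z * (R / nrm)

# Doubly stochastic mixing matrix for a ring of agents (illustrative topology).
W = np.zeros((n_agents, n_agents))
for i in range(n_agents):
    W[i, i] = 0.5
    W[i, (i - 1) % n_agents] = 0.25
    W[i, (i + 1) % n_agents] = 0.25

x = rng.standard_normal((n_agents, dim))       # local iterates, one row per agent
y = np.zeros((n_agents, dim))                  # gradient trackers
g_prev = np.zeros((n_agents, dim))             # previous gradient estimates

for k in range(1, 5001):
    gamma_k = 0.01 / k ** 0.75                 # decaying step size (assumed schedule)
    beta_k = 1.0 / k ** 0.25                   # decaying smoothing radius (assumed schedule)
    g = np.vstack([one_point_estimate(i, x[i], beta_k) for i in range(n_agents)])
    y = W @ y + g - g_prev                     # gradient tracking: consensus on estimates
    x = np.vstack([project(z) for z in (W @ x - gamma_k * y)])  # mix, descend, project
    g_prev = g

x_bar = x.mean(axis=0)
print("disagreement across agents:", np.linalg.norm(x - x_bar))
print("average local objective at the mean iterate:",
      np.mean([0.5 * np.sum((A[i] @ x_bar - b[i]) ** 2) for i in range(n_agents)]))
```

The point the sketch tries to convey is the query model: each agent touches its objective at exactly one perturbed point per iteration, and the tracking variable y lets neighbours share gradient information despite the heavy noise of one-point estimates.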
Related papers
- Single Point-Based Distributed Zeroth-Order Optimization with a Non-Convex Stochastic Objective Function [14.986031916712108]
We introduce a zero-order distributed optimization method based on a one-point estimate of the gradient tracking technique.
We prove that this new technique converges using noisy numerical function queries.
arXiv Detail & Related papers (2024-10-08T11:45:45Z) - Adaptive Variance Reduction for Stochastic Optimization under Weaker Assumptions [26.543628010637036]
We introduce a novel adaptive variance reduction method that achieves an optimal convergence rate of $\mathcal{O}(\log T)$ for non-convex functions.
We also extend the proposed technique to obtain the same optimal rate of $\mathcal{O}(\log T)$ for compositional optimization.
arXiv Detail & Related papers (2024-06-04T04:39:51Z) - CEDAS: A Compressed Decentralized Stochastic Gradient Method with Improved Convergence [9.11726703830074]
In this paper, we consider solving the distributed optimization problem under the communication restricted setting.
We present a method based on compressed exact diffusion, termed CEDAS.
arXiv Detail & Related papers (2023-01-14T09:49:15Z) - Explicit Second-Order Min-Max Optimization Methods with Optimal Convergence Guarantee [86.05440220344755]
We propose and analyze inexact regularized Newton-type methods for finding a global saddle point of convex-concave unconstrained min-max optimization problems.
We show that the proposed methods generate iterates that remain within a bounded set and that the iterates converge to an $\epsilon$-saddle point within $O(\epsilon^{-2/3})$ iterations in terms of a restricted gap function.
arXiv Detail & Related papers (2022-10-23T21:24:37Z) - Stochastic Zeroth order Descent with Structured Directions [10.604744518360464]
We introduce and analyze Structured Zeroth order Descent (SSZD), a finite-difference approach that approximates the gradient along a set of $l \leq d$ directions, where $d$ is the dimension of the ambient space.
For convex functions, we prove almost sure convergence of the iterates and a convergence rate on the function values of $O((d/l)k^{-c})$ for every $c < 1/2$, which is arbitrarily close to that of Stochastic Gradient Descent (SGD) in terms of the number of iterations (a minimal sketch of this kind of structured finite-difference estimate is given after this list).
arXiv Detail & Related papers (2022-06-10T14:00:06Z) - Sharper Rates and Flexible Framework for Nonconvex SGD with Client and
Data Sampling [64.31011847952006]
We revisit the problem of finding an approximately stationary point of the average of $n$ smooth and possibly non-convex functions.
We generalize the underlying algorithm so that it can provably work with virtually any sampling mechanism.
We provide the most general and most accurate analysis of the optimal bound in the smooth non-convex regime.
arXiv Detail & Related papers (2022-06-05T21:32:33Z) - STORM+: Fully Adaptive SGD with Momentum for Nonconvex Optimization [74.1615979057429]
We investigate stochastic non-convex optimization problems where the objective is an expectation over smooth loss functions.
Our work builds on the STORM algorithm, in conjunction with a novel approach to adaptively set the learning rate and momentum parameters.
arXiv Detail & Related papers (2021-11-01T15:43:36Z) - Momentum Accelerates the Convergence of Stochastic AUPRC Maximization [80.8226518642952]
We study optimization of areas under precision-recall curves (AUPRC), which is widely used for imbalanced tasks.
We develop novel momentum methods with an improved iteration complexity of $O(1/\epsilon^4)$ for finding an $\epsilon$-stationary solution.
We also design a novel family of adaptive methods with the same complexity of $O(1/\epsilon^4)$, which enjoy faster convergence in practice.
arXiv Detail & Related papers (2021-07-02T16:21:52Z) - Zeroth-Order Hybrid Gradient Descent: Towards A Principled Black-Box
Optimization Framework [100.36569795440889]
This work is on the iteration complexity of zeroth-order (ZO) optimization, which does not require first-order gradient information.
We show that with a graceful design in coordinate importance sampling, the proposed ZO optimization method is efficient both in terms of iteration complexity and function query cost.
arXiv Detail & Related papers (2020-12-21T17:29:58Z) - Accelerate the Warm-up Stage in the Lasso Computation via a Homotopic
Approach [2.538209532048867]
A homotopic method is used to approximate the $\ell_1$ penalty used in Lasso-type estimators.
We prove a faster numerical convergence rate than any existing method for computing Lasso-type estimators.
arXiv Detail & Related papers (2020-10-26T22:37:49Z) - Convergence of adaptive algorithms for weakly convex constrained
optimization [59.36386973876765]
We prove an $\tilde{\mathcal{O}}(t^{-1/4})$ rate of convergence for the norm of the gradient of the Moreau envelope.
Our analysis works with mini-batch size of $1$, constant first and second order moment parameters, and possibly smooth optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z)
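
As a companion to the structured zeroth-order descent (SSZD) entry above, here is a minimal sketch, under illustrative assumptions, of a finite-difference gradient surrogate built from $l \leq d$ orthonormal directions and used inside a plain descent loop. The quadratic test function, the Gaussian/QR direction sampling, and the step-size and increment schedules are assumptions, not the SSZD authors' implementation.

```python
# Minimal sketch (assumptions, not the SSZD authors' code): a finite-difference
# gradient surrogate along l <= d structured (orthonormal) directions, used in a
# plain descent loop on an illustrative convex quadratic.
import numpy as np

rng = np.random.default_rng(1)
d, l = 20, 5                       # ambient dimension and number of directions (l <= d)
D = np.linspace(1.0, 5.0, d)       # eigenvalues of the illustrative quadratic

def f(x):
    """Illustrative smooth convex objective f(x) = 0.5 * x^T diag(D) x."""
    return 0.5 * np.dot(D * x, x)

def structured_fd_gradient(x, h):
    """Finite-difference surrogate of grad f(x) along l random orthonormal directions."""
    Q, _ = np.linalg.qr(rng.standard_normal((d, l)))   # l orthonormal columns
    fx = f(x)
    g = np.zeros(d)
    for j in range(l):
        p = Q[:, j]
        g += (f(x + h * p) - fx) / h * p               # forward difference along p
    return (d / l) * g             # rescaling makes the surrogate unbiased as h -> 0

x = rng.standard_normal(d)
for k in range(1, 2001):
    step = 0.05 / np.sqrt(k)       # decaying step size (assumed schedule)
    h = 1.0 / k                    # shrinking finite-difference increment
    x = x - step * structured_fd_gradient(x, h)

print("f(x) after 2000 structured ZO steps:", f(x))
```

With l = d the surrogate reduces to a full finite-difference gradient; taking l < d trades per-iteration query cost for the (d/l) factor that appears in the quoted rate.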
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.