Robust Distributed Optimization With Randomly Corrupted Gradients
- URL: http://arxiv.org/abs/2106.14956v1
- Date: Mon, 28 Jun 2021 19:45:25 GMT
- Title: Robust Distributed Optimization With Randomly Corrupted Gradients
- Authors: Berkay Turan, Cesar A. Uribe, Hoi-To Wai, Mahnoosh Alizadeh
- Abstract summary: We propose a first-order distributed optimization algorithm that is provably robust to Byzantine failures, i.e., arbitrary and potentially adversarial behavior.
Our algorithm achieves order-optimal statistical error and convergence rates.
- Score: 24.253191879453784
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a first-order distributed optimization algorithm
that is provably robust to Byzantine failures, i.e., arbitrary and potentially
adversarial behavior, where all the participating agents are prone to failure.
We model each agent's state over time as a two-state Markov chain that
indicates Byzantine or trustworthy behaviors at different time instants. We set
no restrictions on the maximum number of Byzantine agents at any given time. We
design our method based on three layers of defense: 1) temporal gradient
averaging, 2) robust aggregation, and 3) gradient normalization. We study two
settings for stochastic optimization, namely Sample Average Approximation and
Stochastic Approximation, and prove that for strongly convex and smooth
non-convex cost functions, our algorithm achieves order-optimal statistical
error and convergence rates.
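To make the three-layer design concrete, here is a minimal sketch that simulates the two-state agent model and composes the three defenses into a single update. It is an illustration under stated assumptions, not the authors' reference implementation: the transition probabilities, window length, and the choice of coordinate-wise median as the robust aggregator are placeholders chosen for readability.

```python
# Minimal sketch of the three-layer defense (illustrative, not the
# paper's reference code). Agents flip between trustworthy (0) and
# Byzantine (1) states via a two-state Markov chain; p_fail and
# p_recover are assumed values, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)

def step_states(states, p_fail=0.1, p_recover=0.3):
    """Advance each agent's two-state Markov chain by one step."""
    u = rng.random(states.shape)
    nxt = states.copy()
    nxt[(states == 0) & (u < p_fail)] = 1      # trustworthy -> Byzantine
    nxt[(states == 1) & (u < p_recover)] = 0   # Byzantine -> trustworthy
    return nxt

def temporal_average(history, window=5):
    """Layer 1: average each agent's last `window` reported gradients.
    `history` has shape (T, n_agents, dim)."""
    return history[-window:].mean(axis=0)

def robust_aggregate(grads):
    """Layer 2: coordinate-wise median, one common robust aggregator."""
    return np.median(grads, axis=0)

def normalize(g, eps=1e-12):
    """Layer 3: keep only the direction of the aggregated gradient."""
    return g / (np.linalg.norm(g) + eps)

def robust_step(x, history, lr=0.1):
    """One update combining the three layers of defense."""
    per_agent = temporal_average(np.asarray(history))  # (n_agents, dim)
    aggregated = robust_aggregate(per_agent)           # (dim,)
    return x - lr * normalize(aggregated)
```

One design intuition: normalization bounds the size of every update, so even a round in which corrupted gradients survive the first two layers can move the iterate by at most the step size.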
Related papers
- Stability and Generalization for Stochastic Recursive Momentum-based Algorithms for (Strongly-)Convex One to $K$-Level Stochastic Optimizations [20.809499420384256]
STORM-based algorithms have been widely developed to solve one to $K$-level ($K \geq 3$) optimization problems.
This paper provides a comprehensive analysis of three representative STORM-based algorithms.
arXiv Detail & Related papers (2024-07-07T07:07:04Z)
- Stochastic-Constrained Stochastic Optimization with Markovian Data [2.1756081703276]
We study the setting where data samples are drawn from a Markov chain and thus are not independent and identically distributed.
We propose two variants of drift-plus-penalty; one is for the case when the mixing time of the underlying Markov chain is known.
Our algorithms apply to a more general setting of constrained online convex optimization where the sequence of constraint functions follows a Markov chain.
arXiv Detail & Related papers (2023-12-07T14:09:27Z)
- First Order Methods with Markovian Noise: from Acceleration to Variational Inequalities [91.46841922915418]
We present a unified approach for the theoretical analysis of first-order methods under Markovian noise.
Our approach covers both non-convex and strongly convex minimization problems.
We provide bounds that match the oracle complexity in the case of strongly convex optimization problems.
arXiv Detail & Related papers (2023-05-25T11:11:31Z)
- Two-Timescale Stochastic Approximation for Bilevel Optimisation Problems in Continuous-Time Models [0.0]
We analyse the properties of a continuous-time, two-timescale approximation algorithm designed for bilevel optimisation problems in continuous-time models.
We obtain the weak convergence rate of this algorithm in the form of a central limit theorem.
arXiv Detail & Related papers (2022-06-14T17:12:28Z)
- Accelerating Stochastic Probabilistic Inference [1.599072005190786]
Stochastic Variational Inference (SVI) has been increasingly attractive thanks to its ability to find good posterior approximations of probabilistic models.
Almost all state-of-the-art SVI algorithms are based on first-order optimization and often suffer from poor convergence rates.
We bridge the gap between second-order methods and variational inference by proposing a second-order based variational inference approach.
arXiv Detail & Related papers (2022-03-15T01:19:12Z)
- A Two-Time-Scale Stochastic Optimization Framework with Applications in Control and Reinforcement Learning [13.908826484332282]
We study a new two-time-scale gradient method for solving optimization problems.
Our first contribution is to characterize the finite-time complexity of the proposed two-time-scale gradient algorithm.
We apply our framework to gradient-based policy evaluation algorithms in reinforcement learning.
arXiv Detail & Related papers (2021-09-29T23:15:23Z)
- High Probability Complexity Bounds for Non-Smooth Stochastic Optimization with Heavy-Tailed Noise [51.31435087414348]
It is essential to theoretically guarantee that algorithms provide a small objective residual with high probability.
Existing methods for non-smooth convex optimization have complexity bounds with dependence on the confidence level.
We propose novel stepsize rules for two methods with gradient clipping.
arXiv Detail & Related papers (2021-06-10T17:54:21Z)
- Navigating to the Best Policy in Markov Decision Processes [68.8204255655161]
We investigate the active pure exploration problem in Markov Decision Processes.
The agent sequentially selects actions and, from the resulting system trajectory, aims at identifying the best policy as fast as possible.
arXiv Detail & Related papers (2021-06-05T09:16:28Z)
- Byzantine-Resilient Non-Convex Stochastic Gradient Descent [61.6382287971982]
We study adversary-resilient stochastic distributed optimization, in which machines can independently compute gradients and cooperate to optimize a shared objective.
Our algorithm is based on a new concentration technique, and we characterize its sample complexity.
It is very practical: it improves upon the performance of all prior methods when no Byzantine machines are present.
arXiv Detail & Related papers (2020-12-28T17:19:32Z)
- Learning from History for Byzantine Robust Optimization [52.68913869776858]
Byzantine robustness has received significant attention recently given its importance for distributed learning.
We show that most existing robust aggregation rules may not converge even in the absence of any Byzantine attackers.
arXiv Detail & Related papers (2020-12-18T16:22:32Z)
- Robust, Accurate Stochastic Optimization for Variational Inference [68.83746081733464]
We show that common optimization methods lead to poor variational approximations if the problem is moderately large.
Motivated by these findings, we develop a more robust and accurate optimization framework by viewing the underlying algorithm as producing a Markov chain.
arXiv Detail & Related papers (2020-09-01T19:12:11Z)