Optimal Complexity in Byzantine-Robust Distributed Stochastic Optimization with Data Heterogeneity
- URL: http://arxiv.org/abs/2503.16337v1
- Date: Thu, 20 Mar 2025 16:56:06 GMT
- Title: Optimal Complexity in Byzantine-Robust Distributed Stochastic Optimization with Data Heterogeneity
- Authors: Qiankun Shi, Jie Peng, Kun Yuan, Xiao Wang, Qing Ling
- Abstract summary: In this paper, we establish tight lower bounds for Byzantine-robust distributed first-order stochastic optimization methods. To fill the gap, we leverage the techniques of Nesterov's acceleration and variance reduction to develop novel Byzantine-robust distributed stochastic optimization methods.
- Score: 37.14123597310607
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we establish tight lower bounds for Byzantine-robust distributed first-order stochastic optimization methods in both strongly convex and non-convex stochastic optimization. We reveal that when the distributed nodes have heterogeneous data, the convergence error comprises two components: a non-vanishing Byzantine error and a vanishing optimization error. We establish the lower bounds on the Byzantine error and on the minimum number of queries to a stochastic gradient oracle required to achieve an arbitrarily small optimization error. Nevertheless, we identify significant discrepancies between our established lower bounds and the existing upper bounds. To fill this gap, we leverage the techniques of Nesterov's acceleration and variance reduction to develop novel Byzantine-robust distributed stochastic optimization methods that provably match these lower bounds, up to logarithmic factors, implying that our established lower bounds are tight.
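The setting in the abstract can be made concrete with a toy simulation: honest nodes return stochastic gradients of heterogeneous local losses, Byzantine nodes return arbitrary vectors, and the server applies a robust aggregator. The sketch below uses coordinate-wise median as one standard aggregator; it is not the paper's accelerated, variance-reduced method, and all names and parameters are illustrative.

```python
import numpy as np

def byzantine_robust_sgd(grad_fns, n_byzantine, x0, lr=0.1, steps=300, rng=None):
    """Distributed SGD with coordinate-wise median aggregation (illustrative)."""
    rng = rng or np.random.default_rng(0)
    x = x0.copy()
    for _ in range(steps):
        honest = [g(x, rng) for g in grad_fns]            # honest stochastic gradients
        attacks = [rng.normal(0.0, 10.0, size=x.shape)    # Byzantine nodes: arbitrary vectors
                   for _ in range(n_byzantine)]
        agg = np.median(np.stack(honest + attacks), axis=0)  # robust aggregation
        x -= lr * agg
    return x

# Heterogeneous data: node i holds the quadratic loss ||x - c_i||^2 / 2.
centers = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
grad_fns = [lambda x, rng, c=c: (x - c) + 0.1 * rng.normal(size=x.shape)
            for c in centers]
print(byzantine_robust_sgd(grad_fns, n_byzantine=1, x0=np.zeros(2)))
```

The run settles near a robust aggregate of the local minimizers rather than the minimizer of the average loss; that residual gap illustrates the non-vanishing Byzantine error under data heterogeneity that the lower bounds show cannot be eliminated.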
Related papers
- Stochastic Zeroth-Order Optimization under Strongly Convexity and Lipschitz Hessian: Minimax Sample Complexity [59.75300530380427]
We consider the problem of optimizing second-order smooth and strongly convex functions where the algorithm only has access to noisy evaluations of the objective function it queries.
We provide the first tight characterization for the rate of the minimax simple regret by developing matching upper and lower bounds.
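As an aside on the oracle model, a zeroth-order method typically builds a gradient surrogate from such noisy evaluations; a minimal two-point finite-difference estimator (a standard construction, not necessarily the paper's) might look like this:

```python
import numpy as np

def zo_gradient(f, x, h=0.1, n_samples=50, rng=None):
    """Two-point zeroth-order gradient estimate from noisy function values."""
    rng = rng or np.random.default_rng(0)
    g = np.zeros_like(x)
    for _ in range(n_samples):
        u = rng.normal(size=x.shape)                       # random probe direction
        g += (f(x + h * u) - f(x - h * u)) / (2 * h) * u   # directional difference
    return g / n_samples

# Noisy evaluations of the strongly convex quadratic f(x) = ||x||^2 / 2.
noise = np.random.default_rng(1)
f = lambda x: 0.5 * x @ x + noise.normal(scale=0.01)
print(zo_gradient(f, np.ones(3)))  # roughly the true gradient (1, 1, 1)
```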
arXiv Detail & Related papers (2024-06-28T02:56:22Z) - Low-Rank Extragradient Methods for Scalable Semidefinite Optimization [17.384717824118255]
We focus on high-dimensional and plausible settings in which the problem admits a low-rank solution.
We provide several theoretical results proving that, under these circumstances, the well-known Extragradient method converges to a solution of the constrained optimization problem.
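For orientation, one extragradient iteration evaluates the operator twice, once to extrapolate and once to update; below is a generic sketch on a bilinear toy problem (the paper's low-rank SDP machinery, such as cheap low-rank projections, is not reproduced here):

```python
import numpy as np

def extragradient(F, project, x0, lr=0.1, steps=3000):
    """Generic extragradient method for a monotone operator F over a convex set."""
    x = x0.copy()
    for _ in range(steps):
        y = project(x - lr * F(x))   # extrapolation (look-ahead) step
        x = project(x - lr * F(y))   # update with the operator at the look-ahead point
    return x

# Bilinear saddle problem min_u max_v u*v: F(u, v) = (v, -u); feasible set = unit ball.
F = lambda z: np.array([z[1], -z[0]])
project = lambda z: z / max(1.0, np.linalg.norm(z))
print(extragradient(F, project, np.array([1.0, 1.0])))  # approaches the saddle (0, 0)
```

Plain gradient descent-ascent diverges on this bilinear example; the look-ahead evaluation is what makes extragradient converge.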
arXiv Detail & Related papers (2024-02-14T10:48:00Z) - An Inexact Halpern Iteration with Application to Distributionally Robust
Optimization [9.529117276663431]
We investigate inexact variants of the scheme in both deterministic and stochastic settings.
We show that by choosing the inexactness appropriately, the inexact schemes admit an $O(k^{-1})$ convergence rate in terms of the (expected) residue norm.
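The exact Halpern scheme underlying these variants anchors every step to the starting point with a vanishing weight, which is the mechanism behind the $O(k^{-1})$ residual rate; a minimal sketch for a nonexpansive map $T$ (the paper's inexactness control is not modeled here):

```python
import numpy as np

def halpern(T, x0, steps=2000):
    """Halpern iteration: x_{k+1} = b_k * x0 + (1 - b_k) * T(x_k) with b_k = 1/(k+2)."""
    x = x0.copy()
    for k in range(steps):
        b = 1.0 / (k + 2)            # vanishing anchor weight
        x = b * x0 + (1 - b) * T(x)
    return x

# Nonexpansive map: rotation by 90 degrees; its unique fixed point is the origin.
T = lambda z: np.array([-z[1], z[0]])
x = halpern(T, np.array([1.0, 1.0]))
print(np.linalg.norm(x - T(x)))  # residue norm, decaying at roughly O(1/k)
```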
arXiv Detail & Related papers (2024-02-08T20:12:47Z) - First Order Methods with Markovian Noise: from Acceleration to Variational Inequalities [91.46841922915418]
We present a unified approach for the theoretical analysis of first-order methods under Markovian noise.
Our approach covers both non-convex and strongly convex minimization problems.
We provide bounds that match the known oracle lower bounds in the case of strongly convex optimization problems.
arXiv Detail & Related papers (2023-05-25T11:11:31Z) - High-Probability Bounds for Stochastic Optimization and Variational
Inequalities: the Case of Unbounded Variance [59.211456992422136]
We propose algorithms with high-probability convergence results under less restrictive assumptions.
These results justify the usage of the considered methods for solving problems that do not fit standard functional classes in optimization.
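The usual device behind such high-probability guarantees when the noise may have unbounded variance is gradient clipping; the sketch below shows a basic clipped-SGD step on a heavy-tailed toy problem (the step size and clipping level are illustrative, not the schedules from the paper):

```python
import numpy as np

def clipped_sgd(grad, x0, clip_level=1.0, lr=0.05, steps=2000, rng=None):
    """SGD with gradient clipping to tame heavy-tailed stochastic noise."""
    rng = rng or np.random.default_rng(0)
    x = x0.copy()
    for _ in range(steps):
        g = grad(x, rng)
        norm = np.linalg.norm(g)
        if norm > clip_level:
            g *= clip_level / norm   # rescale outliers to the clipping radius
        x -= lr * g
    return x

# Quadratic objective with Student-t noise (df = 1.5, so the variance is infinite).
grad = lambda x, rng: x + rng.standard_t(df=1.5, size=x.shape)
print(clipped_sgd(grad, np.ones(2)))  # stays in a small neighborhood of the minimizer
```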
arXiv Detail & Related papers (2023-02-02T10:37:23Z) - Optimal Algorithms for Stochastic Complementary Composite Minimization [55.26935605535377]
Inspired by regularization techniques in statistics and machine learning, we study complementary composite minimization.
We provide novel excess risk bounds, both in expectation and with high probability.
Our algorithms are nearly optimal, which we prove via novel lower complexity bounds for this class of problems.
arXiv Detail & Related papers (2022-11-03T12:40:24Z) - Explicit Second-Order Min-Max Optimization Methods with Optimal Convergence Guarantee [86.05440220344755]
We propose and analyze inexact regularized Newton-type methods for finding a global saddle point of convex-concave unconstrained min-max optimization problems.
We show that the proposed methods generate iterates that remain within a bounded set and converge to an $\epsilon$-saddle point within $O(\epsilon^{-2/3})$ iterations in terms of a restricted gap function.
arXiv Detail & Related papers (2022-10-23T21:24:37Z) - Distributed Sketching for Randomized Optimization: Exact
Characterization, Concentration and Lower Bounds [54.51566432934556]
We consider distributed optimization methods for problems where forming the Hessian is computationally challenging.
We leverage randomized sketches for reducing the problem dimensions as well as preserving privacy and improving straggler resilience in asynchronous distributed systems.
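As a reference point for how such sketches are used, the classical sketch-and-solve recipe compresses a tall least-squares problem with a random embedding before solving it; a minimal Gaussian-sketch example (chosen here for simplicity; the paper studies more refined sketches and their concentration):

```python
import numpy as np

def sketch_and_solve(A, b, sketch_rows, rng=None):
    """Approximately solve min_x ||Ax - b|| after applying a Gaussian sketch S."""
    rng = rng or np.random.default_rng(0)
    S = rng.normal(size=(sketch_rows, A.shape[0])) / np.sqrt(sketch_rows)
    x, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)   # solve the compressed problem
    return x

rng = np.random.default_rng(1)
A = rng.normal(size=(10000, 20))
b = rng.normal(size=10000)
x_sketch = sketch_and_solve(A, b, sketch_rows=400)
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.linalg.norm(x_sketch - x_exact))  # small: the sketch preserves the geometry
```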
arXiv Detail & Related papers (2022-03-18T05:49:13Z) - Accelerated and instance-optimal policy evaluation with linear function
approximation [17.995515643150657]
Existing algorithms fail to match at least one of the instance-dependent lower bounds for this problem.
We develop an accelerated, variance-reduced fast temporal difference algorithm that simultaneously matches both lower bounds and attains a strong notion of instance-optimality.
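For orientation, the baseline these methods improve on is plain TD(0) with linear function approximation, which nudges the weights toward the bootstrapped target; a minimal sketch on synthetic transition data (the data and names are purely illustrative, not a real MDP):

```python
import numpy as np

def td0(features, rewards, next_features, lr=0.05, gamma=0.9, passes=50):
    """Plain TD(0) with linear values: w <- w + lr * td_error * phi(s)."""
    w = np.zeros(features.shape[1])
    for _ in range(passes):
        for phi, r, phi_next in zip(features, rewards, next_features):
            td_error = r + gamma * (phi_next @ w) - (phi @ w)
            w += lr * td_error * phi
    return w

# Synthetic (phi(s), r, phi(s')) tuples just to exercise the update rule.
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 4))
next_features = rng.normal(size=(200, 4))
rewards = features @ np.array([1.0, -0.5, 0.0, 2.0]) + 0.1 * rng.normal(size=200)
print(td0(features, rewards, next_features))
```

Variance-reduced schemes of the kind referenced above typically center the single-sample td_error around a periodically refreshed full-batch estimate.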
arXiv Detail & Related papers (2021-12-24T17:21:04Z) - Nonlinear matrix recovery using optimization on the Grassmann manifold [18.655422834567577]
We investigate the problem of recovering a partially observed high-rank matrix whose columns obey a nonlinear structure such as a union of subspaces.
We show that the alternating minimization iterates converge to a unique limit point using the Kurdyka-Łojasiewicz property.
arXiv Detail & Related papers (2021-09-13T16:13:13Z) - Halpern Iteration for Near-Optimal and Parameter-Free Monotone Inclusion
and Strong Solutions to Variational Inequalities [14.848525762485872]
We leverage the connections between nonexpansive maps, monotone Lipschitz operators, and proximal mappings to obtain near-optimal solutions to monotone inclusion problems.
These results translate into near-optimal guarantees for approximating strong solutions to variational inequality problems, approximating convex-concave min-max optimization problems, and minimizing the norm of the gradient in min-max optimization problems.
arXiv Detail & Related papers (2020-02-20T17:12:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.