Distributed Saddle-Point Problems Under Similarity
- URL: http://arxiv.org/abs/2107.10706v1
- Date: Thu, 22 Jul 2021 14:25:16 GMT
- Title: Distributed Saddle-Point Problems Under Similarity
- Authors: Aleksandr Beznosikov, Gesualdo Scutari, Alexander Rogozin, Alexander
Gasnikov
- Abstract summary: We show that a given suboptimality $\epsilon>0$ is achieved over master/workers networks in $\Omega\big(\Delta\cdot \delta/\mu\cdot \log (1/\varepsilon)\big)$ rounds of communications.
We then propose algorithms matching the lower bounds over either type of network (up to log-factors).
We assess the effectiveness of the proposed algorithms on a robust logistic regression problem.
- Score: 173.19083235638104
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study solution methods for (strongly-)convex-(strongly)-concave
Saddle-Point Problems (SPPs) over networks of two types: master/workers (thus
centralized) architectures and meshed (thus decentralized) networks. The local
functions at each node are assumed to be similar, due to statistical data
similarity or otherwise. We establish lower complexity bounds for a fairly
general class of algorithms solving the SPP. We show that a given suboptimality
$\epsilon>0$ is achieved over master/workers networks in
$\Omega\big(\Delta\cdot \delta/\mu\cdot \log (1/\varepsilon)\big)$ rounds of
communications, where $\delta>0$ measures the degree of similarity of the local
functions, $\mu$ is their strong convexity constant, and $\Delta$ is the
diameter of the network. The lower communication complexity bound over meshed
networks reads $\Omega\big(1/{\sqrt{\rho}} \cdot {\delta}/{\mu}\cdot\log
(1/\varepsilon)\big)$, where $\rho$ is the (normalized) eigengap of the gossip
matrix used for the communication between neighbouring nodes. We then propose
algorithms matching the lower bounds over either type of network (up to
log-factors). We assess the effectiveness of the proposed algorithms on a
robust logistic regression problem.
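The two lower bounds above are driven by two network quantities: the diameter $\Delta$ (master/workers case) and the normalized eigengap $\rho$ of the gossip matrix (meshed case). As a rough illustration of where $\rho$ comes from, the sketch below builds a gossip matrix with Metropolis-Hastings weights for a small ring graph and computes its eigengap; the weight rule and the example graph are illustrative choices, not constructions taken from the paper.

```python
import numpy as np

def metropolis_weights(adj):
    """Gossip matrix with Metropolis-Hastings weights: symmetric,
    doubly stochastic, with W[i, j] > 0 only on edges (i, j)."""
    n = adj.shape[0]
    deg = adj.sum(axis=1)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j and adj[i, j]:
                W[i, j] = 1.0 / (1 + max(deg[i], deg[j]))
        W[i, i] = 1.0 - W[i].sum()  # rows sum to one
    return W

def eigengap(W):
    """rho = 1 - |lambda_2(W)|: one minus the second-largest eigenvalue
    magnitude of the symmetric gossip matrix."""
    lams = np.sort(np.abs(np.linalg.eigvalsh(W)))[::-1]
    return 1.0 - lams[1]

# 5-node ring: diameter Delta = 2, modest eigengap
n = 5
adj = np.zeros((n, n), dtype=int)
for i in range(n):
    adj[i, (i + 1) % n] = adj[(i + 1) % n, i] = 1

W = metropolis_weights(adj)
rho = eigengap(W)
```

A smaller $\rho$ (a more poorly connected graph) inflates the $\Omega\big(1/\sqrt{\rho}\cdot\delta/\mu\cdot\log(1/\varepsilon)\big)$ communication bound, which is why accelerated gossip schemes target the $1/\sqrt{\rho}$ dependence.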
Related papers
- Near-Optimal Distributed Minimax Optimization under the Second-Order Similarity [22.615156512223763]
We propose the stochastic variance-reduced optimistic gradient sliding (SVOGS) method, which takes advantage of the finite-sum structure in the objective.
We prove communication rounds of $\mathcal{O}(\delta D^2/\varepsilon)$, communication complexity of $\mathcal{O}\big(n+\sqrt{n}\,\delta D^2/\varepsilon\big)$, and local gradient calls of $\tilde{\mathcal{O}}\big(n+\sqrt{n}(\delta+L)D^2/\varepsilon\big)$.
arXiv Detail & Related papers (2024-05-25T08:34:49Z) - Nearly Optimal Regret for Decentralized Online Convex Optimization [53.433398074919]
Decentralized online convex optimization (D-OCO) aims to minimize a sequence of global loss functions using only local computations and communications.
We develop novel D-OCO algorithms that can respectively reduce the regret bounds for convex and strongly convex functions.
Our algorithms are nearly optimal in terms of $T$, $n$, and $\rho$.
arXiv Detail & Related papers (2024-02-14T13:44:16Z) - Most Neural Networks Are Almost Learnable [52.40331776572531]
We show that for any fixed $\epsilon>0$ and depth $i$, there is a poly-time algorithm that learns random Xavier networks of depth $i$.
The algorithm runs in time and sample complexity of $(\bar{d})^{\mathrm{poly}(\epsilon^{-1})}$, where $\bar{d}$ is the size of the network.
For some cases of sigmoid and ReLU-like activations, the bound can be improved to $(\bar{d})^{\mathrm{polylog}(\epsilon^{-1})}$.
arXiv Detail & Related papers (2023-05-25T22:27:42Z) - An Optimal Stochastic Algorithm for Decentralized Nonconvex Finite-sum
Optimization [25.21457349137344]
We prove that DEAREST requires at most $\mathcal{O}\big(mn+\sqrt{mn}\,L\varepsilon^{-2}\big)$ incremental first-order oracle (IFO) calls and $\mathcal{O}\big(L\varepsilon^{-2}/\sqrt{1-\lambda_W}\big)$ communication rounds.
arXiv Detail & Related papers (2022-10-25T11:37:11Z) - Communication-Efficient Adam-Type Algorithms for Distributed Data Mining [93.50424502011626]
We propose a class of novel distributed Adam-type algorithms (i.e., SketchedAMSGrad) utilizing sketching.
Our new algorithm achieves a fast convergence rate of $O\big(\frac{1}{\sqrt{nT}} + \frac{1}{(k/d)^2 T}\big)$ with a communication cost of $O(k \log(d))$ at each iteration.
arXiv Detail & Related papers (2022-10-14T01:42:05Z) - High-Dimensional Inference over Networks: Linear Convergence and
Statistical Guarantees [20.701475313495884]
We study sparse linear regression over a network of agents, modeled as an undirected graph with no server node.
We analyze the convergence rate and statistical guarantees of a distributed projected gradient tracking-based algorithm.
arXiv Detail & Related papers (2022-01-21T01:26:08Z) - Acceleration in Distributed Optimization Under Similarity [72.54787082152278]
We study distributed (strongly convex) optimization problems over a network of agents, with no centralized nodes.
An $\varepsilon$-solution is achieved in $\tilde{\mathcal{O}}\big(\sqrt{\frac{\beta/\mu}{1-\rho}}\,\log(1/\varepsilon)\big)$ communication steps.
This rate matches, for the first time (up to poly-log factors), the lower communication complexity bounds of distributed gossip algorithms applied to the class of problems of interest.
arXiv Detail & Related papers (2021-10-24T04:03:00Z) - Maximizing Influence of Leaders in Social Networks [15.05158252504978]
We consider the edge addition problem for the DeGroot model of opinion dynamics in a social network with $n$ nodes and $m$ edges.
We show that our algorithm is efficient and effective, scaling to large networks with more than a million nodes.
arXiv Detail & Related papers (2021-06-11T02:31:46Z) - Impact of Community Structure on Consensus Machine Learning [0.17188280334580192]
We study consensus machine learning over networks drawn from block models.
We find that there exists a critical level of community structure at which $\tau_\epsilon$ reaches a lower bound.
arXiv Detail & Related papers (2020-11-02T21:41:35Z) - Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that for both problems, in dimension $\frac{1}{\lambda\epsilon}$, one can obtain streaming algorithms with space polynomially smaller than $\frac{1}{\lambda\epsilon}$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.