Accelerating Distributed SGD for Linear Regression using Iterative Pre-Conditioning
- URL: http://arxiv.org/abs/2011.07595v2
- Date: Sat, 28 Nov 2020 08:05:23 GMT
- Title: Accelerating Distributed SGD for Linear Regression using Iterative Pre-Conditioning
- Authors: Kushal Chakrabarti, Nirupam Gupta and Nikhil Chopra
- Abstract summary: Iteratively Pre-conditioned Stochastic Gradient-descent (IPSG) method shown to converge faster than other existing distributed algorithms.
IPSG method's convergence rate compares favorably to prominent algorithms for solving the linear least-squares problem in server-based networks.
- Score: 0.966840768820136
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper considers the multi-agent distributed linear least-squares
problem. The system comprises multiple agents, each agent with a locally
observed set of data points, and a common server with whom the agents can
interact. The agents' goal is to compute a linear model that best fits the
collective data points observed by all the agents. In the server-based
distributed settings, the server cannot access the data points held by the
agents. The recently proposed Iteratively Pre-conditioned Gradient-descent
(IPG) method has been shown to converge faster than other existing distributed
algorithms that solve this problem. In the IPG algorithm, the server and the
agents perform numerous iterative computations. Each of these iterations relies
on the entire batch of data points observed by the agents for updating the
current estimate of the solution. Here, we extend the idea of iterative
pre-conditioning to the stochastic settings, where the server updates the
estimate and the iterative pre-conditioning matrix based on a single randomly
selected data point at every iteration. We show that our proposed Iteratively
Pre-conditioned Stochastic Gradient-descent (IPSG) method converges linearly in
expectation to a proximity of the solution. Importantly, we empirically show
that the proposed IPSG method's convergence rate compares favorably to
prominent stochastic algorithms for solving the linear least-squares problem in
server-based networks.
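The update scheme described in the abstract can be sketched in a few lines. Below is a minimal single-machine sketch of the IPSG idea: at each iteration, one randomly chosen data point drives both a stochastic gradient step and a Richardson-style update of the pre-conditioner. The step sizes and the exact update rules here are illustrative assumptions, not the paper's server-agent protocol.

```python
import numpy as np

def ipsg(A, b, iters=10000, alpha=1e-3, beta=1e-3, seed=0):
    """Sketch of Iteratively Pre-conditioned Stochastic Gradient-descent
    for min_x ||A x - b||^2. Step sizes and update rules are illustrative
    assumptions, not the exact ones from the paper."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    K = np.eye(d)  # running estimate of the inverse Hessian (A^T A / n)^{-1}
    for _ in range(iters):
        i = rng.integers(n)              # one randomly selected data point
        a, y = A[i], b[i]
        g = (a @ x - y) * a              # stochastic gradient at x
        # stochastic Richardson step driving K toward (E[a a^T])^{-1}:
        K = K - beta * (np.outer(a, a @ K) - np.eye(d))
        x = x - alpha * (K @ g)          # pre-conditioned SGD step
    return x
```

On a well-conditioned synthetic problem this converges, in expectation, to a neighborhood of the least-squares solution, consistent with the linear-convergence claim above.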
Related papers
- DASA: Delay-Adaptive Multi-Agent Stochastic Approximation [64.32538247395627]
We consider a setting in which $N$ agents aim to speed up a common Stochastic Approximation problem by acting in parallel and communicating with a central server.
To mitigate the effect of delays and stragglers, we propose DASA, a Delay-Adaptive algorithm for multi-agent Stochastic Approximation.
arXiv Detail & Related papers (2024-03-25T22:49:56Z)
- Distributed Bayesian Estimation in Sensor Networks: Consensus on Marginal Densities [15.038649101409804]
We derive a distributed provably-correct algorithm in the functional space of probability distributions over continuous variables.
We leverage these results to obtain new distributed estimators restricted to subsets of variables observed by individual agents.
This relates to applications such as cooperative localization and federated learning, where the data collected at any agent depends on a subset of all variables of interest.
arXiv Detail & Related papers (2023-12-02T21:10:06Z)
- Iterative Sketching for Secure Coded Regression [66.53950020718021]
We propose methods for speeding up distributed linear regression.
Specifically, we randomly rotate the basis of the system of equations and then subsample blocks, to simultaneously secure the information and reduce the dimension of the regression problem.
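As a rough illustration of the rotate-then-subsample idea, the toy sketch below applies a dense Gaussian orthogonal rotation (standing in for the structured transforms used in practice) and then solves the least-squares problem on a random block of rows; the coding and security machinery of the paper is omitted.

```python
import numpy as np

def sketched_ls(A, b, m, seed=0):
    """Toy sketch-and-solve linear regression: rotate the system by a
    random orthogonal matrix, then keep only m random rows. The dense
    Gaussian rotation is an illustrative stand-in for structured
    transforms (e.g. a subsampled randomized Hadamard transform)."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    # random orthogonal rotation via QR of a Gaussian matrix
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    A_rot, b_rot = Q @ A, Q @ b      # rotation spreads out row leverage
    idx = rng.choice(n, size=m, replace=False)
    # solve the reduced-dimension least-squares problem on the block
    x, *_ = np.linalg.lstsq(A_rot[idx], b_rot[idx], rcond=None)
    return x
```

For a consistent (noiseless) system, any full-rank block of at least d rotated rows recovers the exact solution, which is why subsampling after rotation preserves the answer while shrinking the problem.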
arXiv Detail & Related papers (2023-08-08T11:10:42Z)
- Federated Learning Using Variance Reduced Stochastic Gradient for Probabilistically Activated Agents [0.0]
This paper proposes an algorithm for Federated Learning (FL) with a two-layer structure that achieves both variance reduction and a faster convergence rate to an optimal solution in the setting where each agent has an arbitrary probability of selection in each iteration.
arXiv Detail & Related papers (2022-10-25T22:04:49Z)
- Optimal Clustering with Bandit Feedback [57.672609011609886]
This paper considers the problem of online clustering with bandit feedback.
It includes a novel stopping rule for sequential testing that circumvents the need to solve any NP-hard weighted clustering problem as its subroutines.
We show through extensive simulations on synthetic and real-world datasets that BOC's performance matches the lower bound asymptotically, and significantly outperforms a non-adaptive baseline algorithm.
arXiv Detail & Related papers (2022-02-09T06:05:05Z)
- On Accelerating Distributed Convex Optimizations [0.0]
This paper studies a distributed multi-agent convex optimization problem.
We show that the proposed algorithm converges linearly with an improved rate of convergence than the traditional and adaptive gradient-descent methods.
We demonstrate our algorithm's superior performance compared to prominent distributed algorithms for solving real logistic regression problems.
arXiv Detail & Related papers (2021-08-19T13:19:54Z)
- Robustness of Iteratively Pre-Conditioned Gradient-Descent Method: The Case of Distributed Linear Regression Problem [0.0]
In noise-free systems, the recently proposed distributed linear regression algorithm, named the Iteratively Pre-conditioned Gradient-descent (IPG) method, has been claimed to converge faster than related methods.
We empirically show that the robustness of the IPG method compares favorably to the state-of-the-art algorithms.
arXiv Detail & Related papers (2021-01-26T17:51:49Z)
- Coded Stochastic ADMM for Decentralized Consensus Optimization with Edge Computing [113.52575069030192]
Big data, including applications with high security requirements, are often collected and stored on multiple heterogeneous devices, such as mobile devices, drones and vehicles.
Due to the limitations of communication costs and security requirements, it is of paramount importance to extract information in a decentralized manner instead of aggregating data to a fusion center.
We consider the problem of learning model parameters in a multi-agent system with data locally processed via distributed edge nodes.
A class of mini-batch alternating direction method of multipliers (ADMM) algorithms is explored to develop the distributed learning model.
arXiv Detail & Related papers (2020-10-02T10:41:59Z)
- Iterative Pre-Conditioning for Expediting the Gradient-Descent Method: The Distributed Linear Least-Squares Problem [0.966840768820136]
This paper considers the multi-agent linear least-squares problem in a server-agent network.
The goal for the agents is to compute a linear model that optimally fits the collective data points held by all the agents, without sharing their individual local data points.
We propose an iterative pre-conditioning technique that mitigates the deleterious effect of the conditioning of data points on the rate of convergence of the gradient-descent method.
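The full-batch counterpart of this idea can be sketched briefly: a pre-conditioner matrix is refined alongside the iterate, so that gradient steps become progressively less sensitive to the conditioning of the data. The step sizes below are illustrative assumptions, and the server-agent splitting of the computation is omitted.

```python
import numpy as np

def ipg(A, b, iters=300, alpha=0.1, beta=0.1):
    """Toy full-batch iteratively pre-conditioned gradient descent for
    min_x ||A x - b||^2. Step sizes are illustrative, not the paper's."""
    n, d = A.shape
    H = A.T @ A / n                  # Hessian of the averaged cost
    g_const = A.T @ b / n
    x = np.zeros(d)
    K = np.eye(d)                    # pre-conditioner estimate
    for _ in range(iters):
        K = K - beta * (H @ K - np.eye(d))   # Richardson step toward H^{-1}
        x = x - alpha * (K @ (H @ x - g_const))  # pre-conditioned step
    return x
```

As K approaches the inverse Hessian, the effective condition number of the pre-conditioned step approaches one, which is the mechanism behind the mitigation of poor conditioning described above.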
arXiv Detail & Related papers (2020-08-06T20:01:18Z)
- Model Fusion with Kullback--Leibler Divergence [58.20269014662046]
We propose a method to fuse posterior distributions learned from heterogeneous datasets.
Our algorithm relies on a mean field assumption for both the fused model and the individual dataset posteriors.
arXiv Detail & Related papers (2020-07-13T03:27:45Z)
- A Multi-Agent Primal-Dual Strategy for Composite Optimization over Distributed Features [52.856801164425086]
We study multi-agent sharing optimization problems with the objective function being the sum of smooth local functions plus a convex (possibly non-smooth) coupling function.
arXiv Detail & Related papers (2020-06-15T19:40:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.