Distributed Sparse Linear Regression under Communication Constraints
- URL: http://arxiv.org/abs/2301.04022v1
- Date: Mon, 9 Jan 2023 08:23:37 GMT
- Title: Distributed Sparse Linear Regression under Communication Constraints
- Authors: Rodney Fonseca and Boaz Nadler
- Abstract summary: We focus on distributed learning of a sparse linear regression model, under severe communication constraints.
In our schemes, individual machines compute debiased lasso estimators, but send to the fusion center only very few values.
We show in simulations that our scheme works as well as, and in some cases better than, more communication-intensive approaches.
- Score: 4.997371967242497
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In multiple domains, statistical tasks are performed in distributed settings,
with data split among several end machines that are connected to a fusion
center. In various applications, the end machines have limited bandwidth and
power, and thus a tight communication budget. In this work we focus on
distributed learning of a sparse linear regression model, under severe
communication constraints. We propose several two-round distributed schemes,
whose communication per machine is sublinear in the data dimension. In our
schemes, individual machines compute debiased lasso estimators, but send to the
fusion center only very few values. On the theoretical front, we analyze one of
these schemes and prove that with high probability it achieves exact support
recovery at low signal-to-noise ratios, where individual machines fail to
recover the support. We show in simulations that our scheme works as well as,
and in some cases better than, more communication-intensive approaches.
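To make the flavor of such a scheme concrete, here is a minimal Python sketch, not the paper's exact protocol: each machine fits a lasso, applies a (crudely approximated) debiasing correction, and transmits only the indices of its k largest debiased coefficients; the fusion center estimates the support by voting. The ridge-based debiasing proxy, the choice of k, and the voting threshold are illustrative assumptions.
```python
import numpy as np
from sklearn.linear_model import Lasso

def debiased_lasso(X, y, lam=0.1, ridge_eps=1e-2):
    """Debiased lasso on a single machine (simplified).

    The standard debiasing step uses a nodewise-lasso estimate of the
    precision matrix; here a ridge-regularized inverse of the sample
    covariance is a crude stand-in to keep the sketch short.
    """
    n, d = X.shape
    beta = Lasso(alpha=lam, fit_intercept=False).fit(X, y).coef_
    sigma_hat = X.T @ X / n
    M = np.linalg.inv(sigma_hat + ridge_eps * np.eye(d))  # proxy precision matrix
    return beta + M @ X.T @ (y - X @ beta) / n

def fuse_top_k(X_list, y_list, k=10, vote_frac=0.5):
    """Each machine sends only the indices of its k largest debiased
    coefficients; the fusion center keeps coordinates that a fraction
    vote_frac of machines agree on (illustrative aggregation rule)."""
    d = X_list[0].shape[1]
    votes = np.zeros(d)
    for X, y in zip(X_list, y_list):
        b = debiased_lasso(X, y)
        votes[np.argsort(np.abs(b))[-k:]] += 1   # k indices per machine
    return np.flatnonzero(votes >= vote_frac * len(X_list))

# Toy example: m machines, s-sparse ground truth in dimension d.
rng = np.random.default_rng(0)
d, n, m, s = 200, 100, 20, 5
beta_true = np.zeros(d); beta_true[:s] = 1.0
X_list = [rng.standard_normal((n, d)) for _ in range(m)]
y_list = [X @ beta_true + 0.5 * rng.standard_normal(n) for X in X_list]
print(fuse_top_k(X_list, y_list))   # ideally recovers indices 0..4
```
Transmitting k indices per machine costs on the order of k log d bits, which is sublinear in the dimension d for small k; the paper's actual two-round schemes and their aggregation rules differ in the details.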
Related papers
- Perceiver-based CDF Modeling for Time Series Forecasting [25.26713741799865]
We propose a new architecture, called perceiver-CDF, for modeling cumulative distribution functions (CDF) of time series data.
Our approach combines the perceiver architecture with a copula-based attention mechanism tailored for multimodal time series prediction.
Experiments on the unimodal and multimodal benchmarks consistently demonstrate a 20% improvement over state-of-the-art methods.
arXiv Detail & Related papers (2023-10-03T01:13:17Z) - Towards a Better Theoretical Understanding of Independent Subnetwork Training [56.24689348875711]
We take a closer theoretical look at Independent Subnetwork Training (IST).
IST is a recently proposed and highly effective technique for solving the aforementioned problems.
We identify fundamental differences between IST and alternative approaches, such as distributed methods with compressed communication.
arXiv Detail & Related papers (2023-06-28T18:14:22Z) - Federated Empirical Risk Minimization via Second-Order Method [18.548661105227488]
We present an interior point method (IPM) to solve a general empirical risk minimization problem under the federated learning setting.
We show that the communication complexity of each iteration of our IPM is $\tilde{O}(d^{3/2})$, where $d$ is the dimension (i.e., number of features) of the dataset.
arXiv Detail & Related papers (2023-05-27T14:23:14Z) - On Optimizing the Communication of Model Parallelism [74.15423270435949]
We study a novel and important communication pattern in large-scale model-parallel deep learning (DL).
In cross-mesh resharding, a sharded tensor needs to be sent from a source device mesh to a destination device mesh.
We propose two contributions to address cross-mesh resharding: an efficient broadcast-based communication system, and an "overlapping-friendly" pipeline schedule.
arXiv Detail & Related papers (2022-11-10T03:56:48Z) - FeDXL: Provable Federated Learning for Deep X-Risk Optimization [105.17383135458897]
We tackle a novel federated learning (FL) problem for optimizing a family of X-risks, to which no existing algorithms are applicable.
The challenges in designing an FL algorithm for X-risks lie in the non-decomposability of the objective over multiple machines and the interdependency between different machines.
arXiv Detail & Related papers (2022-10-26T00:23:36Z) - Recovery Guarantees for Distributed-OMP [8.393317912360564]
We study distributed schemes for high-dimensional sparse linear regression.
We prove that distributed-OMP schemes recover the support of the regression vector with communication per machine linear in its sparsity and logarithmic in the dimension.
Our simulations show that distributed-OMP schemes are competitive with more computationally intensive methods, and in some cases even outperform them.
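As a rough illustration of a distributed-OMP scheme (a hedged sketch, not necessarily the paper's algorithm): each machine runs OMP up to the sparsity level and sends only the selected indices, so per-machine communication is linear in the sparsity and logarithmic in the dimension, and the fusion center combines the local supports by voting.
```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

def local_omp_support(X, y, s):
    """One machine: run OMP to sparsity s and return the selected indices.
    Sending s indices costs O(s log d) bits: linear in the sparsity,
    logarithmic in the dimension."""
    omp = OrthogonalMatchingPursuit(n_nonzero_coefs=s).fit(X, y)
    return np.flatnonzero(omp.coef_)

def fuse_supports(supports, d, min_votes):
    """Fusion center: keep coordinates selected by at least min_votes
    machines (the voting rule here is an illustrative choice)."""
    votes = np.zeros(d, dtype=int)
    for sup in supports:
        votes[sup] += 1
    return np.flatnonzero(votes >= min_votes)

# Usage sketch:
#   supports = [local_omp_support(X, y, s=5) for X, y in zip(X_list, y_list)]
#   estimate = fuse_supports(supports, d=X_list[0].shape[1], min_votes=len(X_list) // 2)
```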
arXiv Detail & Related papers (2022-09-15T11:43:33Z) - Distributed Nonparametric Estimation under Communication Constraints [0.0]
We provide a general framework for understanding the behavior of distributed estimation under communication constraints.
We derive minimax lower and matching upper bounds in the distributed regression, density estimation, classification, Poisson regression and volatility estimation models.
arXiv Detail & Related papers (2022-04-21T19:04:50Z) - Decentralized Local Stochastic Extra-Gradient for Variational Inequalities [125.62877849447729]
We consider distributed variational inequalities (VIs) on domains with problem data that is heterogeneous (non-IID) and distributed across many devices.
We make a very general assumption on the computational network that covers the settings of fully decentralized calculations.
We theoretically analyze its convergence rate in the strongly-monotone, monotone, and non-monotone settings.
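For context, the classical (centralized, deterministic, unconstrained) extra-gradient update that this line of work extends looks roughly as follows; the paper's decentralized, local, stochastic variant and its analysis are not reproduced here.
```python
import numpy as np

def extragradient(F, z0, step=0.1, n_iters=1000):
    """Plain extra-gradient method for a variational inequality with
    operator F on R^d (projection omitted for simplicity)."""
    z = np.asarray(z0, dtype=float)
    for _ in range(n_iters):
        z_half = z - step * F(z)       # extrapolation step
        z = z - step * F(z_half)       # update at the extrapolated point
    return z

# Toy bilinear saddle point min_x max_y x*y, with F(x, y) = (y, -x):
# plain gradient descent-ascent spirals outward, extra-gradient converges to (0, 0).
F = lambda z: np.array([z[1], -z[0]])
print(extragradient(F, z0=[1.0, 1.0]))
```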
arXiv Detail & Related papers (2021-06-15T17:45:51Z) - A Linearly Convergent Algorithm for Decentralized Optimization: Sending Less Bits for Free! [72.31332210635524]
Decentralized optimization methods enable on-device training of machine learning models without a central coordinator.
We propose a new randomized first-order method which tackles the communication bottleneck by applying randomized compression operators.
We prove that our method can solve the problems without any increase in the number of communications compared to the baseline.
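A typical randomized compression operator of the kind such methods apply is rand-k sparsification, sketched below under the assumption of unbiased rescaling; the paper's exact operator class and algorithm are more general.
```python
import numpy as np

def rand_k(x, k, rng):
    """Unbiased rand-k sparsification: keep k uniformly chosen coordinates
    and rescale by d/k so that E[rand_k(x)] = x. Only k (index, value)
    pairs need to be communicated instead of the full d-vector."""
    d = x.shape[0]
    out = np.zeros_like(x)
    idx = rng.choice(d, size=k, replace=False)
    out[idx] = x[idx] * (d / k)
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal(10)
print(rand_k(x, k=3, rng=rng))
```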
arXiv Detail & Related papers (2020-11-03T13:35:53Z) - Coded Stochastic ADMM for Decentralized Consensus Optimization with Edge Computing [113.52575069030192]
Big data, including applications with high security requirements, are often collected and stored on multiple heterogeneous devices, such as mobile devices, drones and vehicles.
Due to the limitations of communication costs and security requirements, it is of paramount importance to extract information in a decentralized manner instead of aggregating data to a fusion center.
We consider the problem of learning model parameters in a multi-agent system with data locally processed via distributed edge nodes.
A class of mini-batch alternating direction method of multipliers (ADMM) algorithms is explored to develop the distributed learning model.
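For orientation, a textbook global-consensus ADMM loop for distributed least squares is sketched below; this is a simplified stand-in, not the paper's coded, mini-batch, stochastic variant, and the quadratic local objective is an assumption made so that each local update has a closed form.
```python
import numpy as np

def consensus_admm_ls(X_list, y_list, rho=1.0, n_iters=100):
    """Global-consensus ADMM for least squares: each edge node i holds
    (X_i, y_i) and a local variable x_i, and all nodes agree on z."""
    d, m = X_list[0].shape[1], len(X_list)
    x = [np.zeros(d) for _ in range(m)]
    u = [np.zeros(d) for _ in range(m)]
    z = np.zeros(d)
    # Pre-factor the local solves: (X_i^T X_i + rho I) x_i = X_i^T y_i + rho (z - u_i)
    inv = [np.linalg.inv(X.T @ X + rho * np.eye(d)) for X in X_list]
    xty = [X.T @ y for X, y in zip(X_list, y_list)]
    for _ in range(n_iters):
        for i in range(m):
            x[i] = inv[i] @ (xty[i] + rho * (z - u[i]))       # local update
        z = np.mean([x[i] + u[i] for i in range(m)], axis=0)  # consensus step
        for i in range(m):
            u[i] += x[i] - z                                  # dual update
    return z
```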
arXiv Detail & Related papers (2020-10-02T10:41:59Z) - Scaling-up Distributed Processing of Data Streams for Machine Learning [10.581140430698103]
This paper reviews recently developed methods that focus on large-scale distributed optimization in the compute- and bandwidth-limited regime.
It focuses on methods that solve: (i) distributed convex problems, and (ii) distributed principal component analysis, which is a nonconvex problem with geometric structure that permits global convergence.
arXiv Detail & Related papers (2020-05-18T16:28:54Z)