Distributed Nonparametric Estimation under Communication Constraints
- URL: http://arxiv.org/abs/2204.10373v1
- Date: Thu, 21 Apr 2022 19:04:50 GMT
- Title: Distributed Nonparametric Estimation under Communication Constraints
- Authors: Azeem Zaman and Botond Szabó
- Abstract summary: We provide a general framework for understanding the behavior of distributed estimation under communication constraints.
We derive minimax lower and matching upper bounds in the distributed regression, density estimation, classification, Poisson regression and volatility estimation models.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the era of big data, it is necessary to split extremely large data sets
across multiple computing nodes and construct estimators using the distributed
data. When designing distributed estimators, it is desirable to minimize the
amount of communication across the network because transmission between
computers is slow in comparison to computations in a single computer. Our work
provides a general framework for understanding the behavior of distributed
estimation under communication constraints for nonparametric problems. We
provide results for a broad class of models, moving beyond the Gaussian
framework that dominates the literature. As concrete examples we derive minimax
lower and matching upper bounds in the distributed regression, density
estimation, classification, Poisson regression and volatility estimation models
under communication constraints. To assist with this, we provide sufficient
conditions that can be easily verified in all of our examples.
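To make the communication model concrete, here is a minimal sketch of distributed nonparametric regression under a per-machine bit budget. It is not the estimator constructed in the paper: the cosine basis, the uniform quantizer, and all constants below are illustrative assumptions. Each of m machines estimates the first K basis coefficients of the regression function from its local sample, transmits them using K·b bits, and the server averages.

```python
# Hypothetical sketch: distributed regression with b-bit quantized local estimates.
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    """Illustrative regression function, expressed in the cosine basis."""
    return np.sqrt(2) * np.cos(2 * np.pi * x) - 0.5 * np.sqrt(2) * np.cos(5 * np.pi * x)

m, n, K, b = 20, 500, 10, 8  # machines, samples per machine, basis size, bits per coefficient

def local_coeffs(x, y):
    """One machine's estimate of the first K cosine-basis coefficients of f."""
    basis = np.array([np.sqrt(2) * np.cos(np.pi * k * x) for k in range(1, K + 1)])
    return basis @ y / len(x)

def quantize(c, c_max=2.0):
    """Uniform b-bit quantizer on [-c_max, c_max], applied coefficient-wise."""
    levels = 2 ** b
    u = np.clip((c + c_max) / (2 * c_max), 0.0, 1.0)
    return np.round(u * (levels - 1)) / (levels - 1) * 2 * c_max - c_max

transmitted = []
for _ in range(m):
    x = rng.uniform(0, 1, n)
    y = f(x) + rng.normal(0, 1, n)
    transmitted.append(quantize(local_coeffs(x, y)))  # K * b bits per machine

c_hat = np.mean(transmitted, axis=0)                  # server: average quantized estimates
grid = np.linspace(0, 1, 200)
basis = np.array([np.sqrt(2) * np.cos(np.pi * k * grid) for k in range(1, K + 1)])
print("approximate integrated squared error:", np.mean((basis.T @ c_hat - f(grid)) ** 2))
```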
Related papers
- Distributed Learning of Mixtures of Experts [0.0]
We deal with datasets that are either distributed by nature or so large that distributing the computations is the standard way to proceed.
We propose a distributed learning approach for mixtures of experts (MoE) models with an aggregation strategy that constructs a reduction estimator from local estimators fitted in parallel to distributed subsets of the data.
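A generic split-fit-aggregate skeleton may make the pattern concrete. This is not the paper's MoE reduction estimator: a plain least-squares model stands in for the MoE (whose parameters cannot be naively averaged, since experts must first be matched across machines), and all names and sizes are illustrative.

```python
# Hypothetical skeleton: fit local estimators in parallel, aggregate at the center.
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def fit_local(shard):
    """Fit one machine's model on its data shard (least squares as a stand-in)."""
    X, y = shard
    return np.linalg.lstsq(X, y, rcond=None)[0]

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    theta = np.array([1.0, -2.0, 0.5])
    shards = []
    for _ in range(8):                   # 8 machines, each holding a local subset
        X = rng.normal(size=(1000, 3))
        shards.append((X, X @ theta + rng.normal(size=1000)))
    with ProcessPoolExecutor() as pool:  # local fits run in parallel
        local_estimates = list(pool.map(fit_local, shards))
    print("aggregated estimate:", np.mean(local_estimates, axis=0))
```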
arXiv Detail & Related papers (2023-12-15T15:26:13Z)
- Towards a Better Theoretical Understanding of Independent Subnetwork Training [56.24689348875711]
We take a closer theoretical look at Independent Subnetwork Training (IST), a recently proposed and highly effective technique for reducing the communication and computation costs of distributed training.
We identify fundamental differences between IST and alternative approaches, such as distributed methods with compressed communication.
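To make the object of study concrete, here is a toy single round of the generic IST recipe (partition the hidden units across workers, train each resulting subnetwork locally, reassemble the full model); the variant analyzed in the paper may differ in its details.

```python
# Toy single IST round for a one-hidden-layer network; all sizes are illustrative.
import numpy as np

rng = np.random.default_rng(2)
d, h, workers = 5, 16, 4
W1 = 0.3 * rng.normal(size=(h, d))       # input-to-hidden weights
W2 = 0.3 * rng.normal(size=h)            # hidden-to-output weights

X = rng.normal(size=(2000, d))
y = np.tanh(X @ rng.normal(size=d))      # synthetic regression target

parts = np.array_split(rng.permutation(h), workers)  # partition hidden units
for idx in parts:                        # each loop iteration = one worker
    w1, w2 = W1[idx].copy(), W2[idx].copy()
    for _ in range(50):                  # local SGD on the subnetwork alone
        batch = rng.choice(len(X), 64, replace=False)
        Xb, yb = X[batch], y[batch]
        a = np.tanh(Xb @ w1.T)           # activations, shape (64, |idx|)
        err = a @ w2 - yb                # subnetwork residual
        g2 = a.T @ err / 64
        g1 = ((err[:, None] * w2) * (1 - a ** 2)).T @ Xb / 64
        w2 -= 0.1 * g2
        w1 -= 0.1 * g1
    W1[idx], W2[idx] = w1, w2            # reassemble into the full network

pred = np.tanh(X @ W1.T) @ W2
print("train MSE after one IST round:", np.mean((pred - y) ** 2))
```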
arXiv Detail & Related papers (2023-06-28T18:14:22Z)
- Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data [68.62134204367668]
This paper studies score approximation, estimation, and distribution recovery of diffusion models, when data are supported on an unknown low-dimensional linear subspace.
We show that with a properly chosen neural network architecture, the score function can be both accurately approximated and efficiently estimated.
The distribution generated from the estimated score function captures the geometric structure of the data and converges to a close vicinity of the data distribution.
arXiv Detail & Related papers (2023-02-14T17:02:35Z)
- Distributed Sparse Linear Regression under Communication Constraints [4.997371967242497]
We focus on distributed learning of a sparse linear regression model, under severe communication constraints.
In our schemes, individual machines compute debiased lasso estimators, but send to the fusion center only very few values.
We show in simulations that our scheme works as well as, and in some cases better than, more communication-intensive approaches.
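A hedged sketch of that compute-locally-send-little pattern: each machine fits a lasso, applies the standard one-step debiasing correction (written here for a design with X'X/n close to the identity), and transmits only its top-k coordinates; the fusion center keeps coordinates reported by a majority and averages them. The paper's actual selection and combination rules may differ.

```python
# Hypothetical sketch: distributed debiased lasso with few transmitted values.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(3)
p, n, m, k = 200, 400, 10, 5
theta = np.zeros(p)
theta[:3] = [4.0, -3.0, 2.0]                     # 3-sparse signal

reports = []
for _ in range(m):
    X = rng.normal(size=(n, p))
    y = X @ theta + rng.normal(size=n)
    coef = Lasso(alpha=0.1, max_iter=5000).fit(X, y).coef_
    debiased = coef + X.T @ (y - X @ coef) / n   # one-step debiasing (X'X/n ~ I)
    top = np.argsort(np.abs(debiased))[-k:]
    reports.append({int(j): debiased[j] for j in top})  # k (index, value) pairs

votes = {}
for r in reports:
    for j, v in r.items():
        votes.setdefault(j, []).append(v)

theta_hat = np.zeros(p)
for j, vals in votes.items():
    if len(vals) > m / 2:                        # keep majority-supported coordinates
        theta_hat[j] = np.mean(vals)
print("recovered support:", np.flatnonzero(theta_hat))
print("estimation error:", np.linalg.norm(theta_hat - theta))
```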
arXiv Detail & Related papers (2023-01-09T08:23:37Z)
- Quantization for decentralized learning under subspace constraints [61.59416703323886]
We consider decentralized optimization problems where agents have individual cost functions to minimize subject to subspace constraints.
We propose and study an adaptive decentralized strategy where the agents employ differential randomized quantizers to compress their estimates.
The analysis shows that, under some general conditions on the quantization noise, the strategy is stable both in terms of mean-square error and average bit rate.
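A minimal sketch of the differential randomized quantization idea as it is commonly used in this literature: each agent quantizes only the difference between its new estimate and a previously shared reference, with dithered (randomized) rounding so that the quantization error is zero-mean. The paper's exact operator and its subspace-projection step are not shown here.

```python
# Hypothetical sketch: dithered differential quantization of an evolving estimate.
import numpy as np

rng = np.random.default_rng(4)

def randomized_quantize(v, step=0.05):
    """Dithered uniform quantizer: unbiased, error bounded by the step size."""
    dither = rng.uniform(-0.5, 0.5, size=v.shape)
    return step * np.round(v / step + dither)

ref = np.zeros(3)                      # shared reference (receiver holds a copy)
x = np.zeros(3)
for t in range(100):
    x = x + 0.1 * rng.normal(size=3)   # stand-in for an agent's local update
    q = randomized_quantize(x - ref)   # only the increment is transmitted
    ref = ref + q                      # both sides update the same reference
print("reconstruction error:", np.linalg.norm(x - ref))  # stays near the step size
```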
arXiv Detail & Related papers (2022-09-16T09:38:38Z)
- Fundamental Limits of Communication Efficiency for Model Aggregation in Distributed Learning: A Rate-Distortion Approach [54.311495894129585]
We study the limit of communication cost of model aggregation in distributed learning from a rate-distortion perspective.
It is found that the communication gain by exploiting the correlation between worker nodes is significant for SignSGD.
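For concreteness, this is the textbook SignSGD-with-majority-vote protocol the entry refers to (each worker sends one bit per coordinate, the server votes); the rate-distortion analysis itself is not reproduced here, and all constants are illustrative.

```python
# SignSGD with majority vote on a toy quadratic objective.
import numpy as np

rng = np.random.default_rng(5)
d, workers, lr = 10, 9, 0.05
theta = rng.normal(size=d)              # minimize f(theta) = ||theta||^2 / 2

for _ in range(200):
    # each worker observes a noisy gradient and transmits only its signs (d bits)
    signs = np.sign([theta + 0.5 * rng.normal(size=d) for _ in range(workers)])
    vote = np.sign(signs.sum(axis=0))   # server majority vote, d bits broadcast back
    theta -= lr * vote
print("||theta|| after 200 rounds:", np.linalg.norm(theta))
```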
arXiv Detail & Related papers (2022-06-28T13:10:40Z)
- Acceleration in Distributed Optimization Under Similarity [72.54787082152278]
We study distributed (strongly convex) optimization problems over a network of agents, with no centralized nodes.
An $\varepsilon$-solution is achieved in $\tilde{\mathcal{O}}\big(\sqrt{\frac{\beta/\mu}{1-\rho}}\,\log\frac{1}{\varepsilon}\big)$ communication steps.
For the first time, this rate matches (up to poly-log factors) the communication lower-complexity bounds of distributed gossip algorithms applied to the class of problems of interest.
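Spelled out, with notation assumed from standard usage in this literature (the entry's garbled formula does not define its symbols):

```latex
% Assumed notation (not defined in the entry):
%   beta       -- similarity parameter of the local objectives
%   mu         -- strong-convexity constant
%   rho        -- network connectivity parameter
%   varepsilon -- target accuracy
\tilde{\mathcal{O}}\!\left( \sqrt{\frac{\beta/\mu}{1-\rho}} \, \log\frac{1}{\varepsilon} \right)
\quad \text{communication steps}
```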
arXiv Detail & Related papers (2021-10-24T04:03:00Z)
- SignalNet: A Low Resolution Sinusoid Decomposition and Estimation Network [79.04274563889548]
We propose SignalNet, a neural network architecture that detects the number of sinusoids and estimates their parameters from quantized in-phase and quadrature samples.
We introduce a worst-case learning threshold for comparing the results of our network against the underlying data distributions.
In simulation, we find that our algorithm is always able to surpass the threshold for three-bit data but often cannot exceed the threshold for one-bit data.
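Not SignalNet itself, but a sketch of its input pipeline under stated assumptions: complex (in-phase/quadrature) samples of a sum of sinusoids, uniformly quantized per component to b bits. Comparing b = 3 with b = 1 makes the reported three-bit versus one-bit gap plausible, since one-bit samples retain only the sign of each component.

```python
# Hypothetical sketch: b-bit quantization of in-phase/quadrature sinusoid samples.
import numpy as np

rng = np.random.default_rng(6)
N = 64
t = np.arange(N)
freqs, amps = [0.11, 0.23], [1.0, 0.4]
x = sum(a * np.exp(2j * np.pi * f * t) for f, a in zip(freqs, amps))
x = x + 0.05 * (rng.normal(size=N) + 1j * rng.normal(size=N))

def quantize_iq(x, bits, vmax=2.0):
    """Quantize real and imaginary parts separately to 2**bits uniform levels."""
    levels = 2 ** bits
    def q(v):
        u = np.clip(np.round((v + vmax) / (2 * vmax) * (levels - 1)), 0, levels - 1)
        return u / (levels - 1) * 2 * vmax - vmax
    return q(x.real) + 1j * q(x.imag)

for b in (3, 1):
    err = np.linalg.norm(x - quantize_iq(x, b)) / np.linalg.norm(x)
    print(f"{b}-bit relative quantization error: {err:.2f}")
```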
arXiv Detail & Related papers (2021-06-10T04:21:20Z)
- Scaling-up Distributed Processing of Data Streams for Machine Learning [10.581140430698103]
This paper reviews recently developed methods that focus on large-scale distributed optimization in the compute- and bandwidth-limited regime.
It focuses on methods that solve: (i) distributed convex problems, and (ii) distributed principal component analysis, which is a nonconvex problem with geometric structure that permits global convergence.
arXiv Detail & Related papers (2020-05-18T16:28:54Z)
- A General Pairwise Comparison Model for Extremely Sparse Networks [5.298287413134346]
We show that the maximum likelihood estimator for the latent score vector of the subjects is uniformly consistent under a near-minimal condition on network sparsity.
Our results guarantee that the maximum likelihood estimator is justified for estimation in large-scale pairwise comparison networks.
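The entry concerns maximum likelihood in pairwise-comparison models; the Bradley-Terry model is the canonical member of that family, so here is a minimal Bradley-Terry maximum-likelihood fit by gradient ascent on a sparse comparison graph. The paper treats a more general model class; this is only the best-known special case, with illustrative sizes.

```python
# Bradley-Terry MLE on a sparse pairwise-comparison graph (toy example).
import numpy as np

rng = np.random.default_rng(7)
n, n_cmp = 50, 600                      # subjects, comparisons (~12 per subject)
score = rng.normal(size=n)              # latent merit vector to be recovered

pairs = rng.integers(0, n, size=(n_cmp, 2))
pairs = pairs[pairs[:, 0] != pairs[:, 1]]
p_true = 1.0 / (1.0 + np.exp(score[pairs[:, 1]] - score[pairs[:, 0]]))
wins = (rng.random(len(pairs)) < p_true).astype(float)  # 1 if first index won

s = np.zeros(n)                         # MLE by gradient ascent (concave log-likelihood)
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(s[pairs[:, 1]] - s[pairs[:, 0]]))
    g = np.zeros(n)
    np.add.at(g, pairs[:, 0], wins - p) # d loglik / d s_i for the first slot
    np.add.at(g, pairs[:, 1], p - wins)
    s += 0.01 * g

s -= s.mean(); score -= score.mean()    # scores identified only up to a shift
print("correlation with the truth:", np.corrcoef(s, score)[0, 1])
```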
arXiv Detail & Related papers (2020-02-20T16:39:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.