Distributed Sketching Methods for Privacy Preserving Regression
- URL: http://arxiv.org/abs/2002.06538v2
- Date: Sat, 20 Jun 2020 00:36:01 GMT
- Title: Distributed Sketching Methods for Privacy Preserving Regression
- Authors: Burak Bartan, Mert Pilanci
- Abstract summary: We leverage randomized sketches for reducing the problem dimensions as well as preserving privacy and improving straggler resilience in asynchronous distributed systems.
We derive novel approximation guarantees for classical sketching methods and analyze the accuracy of parameter averaging for distributed sketches.
We illustrate the performance of distributed sketches in a serverless computing platform with large scale experiments.
- Score: 54.51566432934556
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we study distributed sketching methods for large scale
regression problems. We leverage multiple randomized sketches for reducing the
problem dimensions as well as preserving privacy and improving straggler
resilience in asynchronous distributed systems. We derive novel approximation
guarantees for classical sketching methods and analyze the accuracy of
parameter averaging for distributed sketches. We consider random matrices
including Gaussian, randomized Hadamard, uniform sampling and leverage score
sampling in the distributed setting. Moreover, we propose a hybrid approach
combining sampling and fast random projections for better computational
efficiency. We illustrate the performance of distributed sketches in a
serverless computing platform with large scale experiments.
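As a rough illustration of the sketch-and-average idea described in the abstract (not the paper's exact algorithm or experimental setup), the NumPy sketch below has each worker apply an independent Gaussian sketch to (A, b), solve the reduced least squares problem, and send back its estimate for averaging; all matrix sizes and data are made up for illustration.

```python
import numpy as np

def sketched_solution(A, b, m, rng):
    """One worker: apply an m x n Gaussian sketch and solve the reduced problem."""
    n = A.shape[0]
    S = rng.standard_normal((m, n)) / np.sqrt(m)  # Gaussian sketch (one of several options)
    return np.linalg.lstsq(S @ A, S @ b, rcond=None)[0]

def distributed_sketch_average(A, b, m, num_workers, seed=0):
    """Average independently sketched solutions; workers never see the raw (A, b)."""
    rng = np.random.default_rng(seed)
    estimates = [sketched_solution(A, b, m, rng) for _ in range(num_workers)]
    return np.mean(estimates, axis=0)

# Illustrative usage on synthetic data (not the paper's experiments)
rng = np.random.default_rng(1)
A = rng.standard_normal((10000, 50))
b = A @ rng.standard_normal(50) + 0.1 * rng.standard_normal(10000)
x_avg = distributed_sketch_average(A, b, m=500, num_workers=20)
x_ls = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.linalg.norm(x_avg - x_ls) / np.linalg.norm(x_ls))  # relative error of the averaged estimate
```

Because each worker only ever operates on a sketched pair (SA, Sb) and the workers are mutually independent, slow or missing workers can simply be dropped from the average, which is how the abstract's privacy and straggler-resilience claims fit together.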
Related papers
- Distributed Markov Chain Monte Carlo Sampling based on the Alternating
Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
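For orientation only: that paper's contribution is a distributed sampling (MCMC) scheme, but the consensus-ADMM skeleton it builds on resembles the deterministic ridge regression sketch below; the data split into (A_i, b_i), the penalty rho, and the iteration count are illustrative assumptions, not values from the paper.

```python
import numpy as np

def consensus_admm_ridge(A_parts, b_parts, lam=0.1, rho=1.0, iters=100):
    """Plain consensus ADMM for ridge regression split across workers.
    Each worker holds (A_i, b_i); z is the shared consensus variable."""
    d = A_parts[0].shape[1]
    K = len(A_parts)
    x = [np.zeros(d) for _ in range(K)]
    u = [np.zeros(d) for _ in range(K)]
    z = np.zeros(d)
    for _ in range(iters):
        for i in range(K):
            # local x-update: (A_i^T A_i + rho I) x_i = A_i^T b_i + rho (z - u_i)
            lhs = A_parts[i].T @ A_parts[i] + rho * np.eye(d)
            rhs = A_parts[i].T @ b_parts[i] + rho * (z - u[i])
            x[i] = np.linalg.solve(lhs, rhs)
        # global z-update with the ridge penalty, then dual updates
        z = rho * sum(x[i] + u[i] for i in range(K)) / (lam + rho * K)
        for i in range(K):
            u[i] += x[i] - z
    return z
```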
arXiv Detail & Related papers (2024-01-29T02:08:40Z) - Gradient Coding with Iterative Block Leverage Score Sampling [42.21200677508463]
We generalize the leverage score sampling sketch for $\ell_2$-subspace embeddings, to accommodate sampling subsets of the transformed data.
This is then used to derive an approximate coded computing approach for first-order methods.
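That paper generalizes leverage score sampling to blocks of transformed data; for context, the plain row-wise leverage score sampling sketch it starts from can be written as below, with the sketch size m left to the caller and the rescaling chosen so the sketched Gram matrix is unbiased.

```python
import numpy as np

def leverage_score_sampling_sketch(A, m, rng=None):
    """Sample m rows of A with probability proportional to their leverage scores
    and rescale, giving an approximate l2-subspace embedding of A."""
    rng = np.random.default_rng() if rng is None else rng
    Q, _ = np.linalg.qr(A)                 # thin QR; squared row norms of Q are leverage scores
    scores = np.sum(Q**2, axis=1)
    probs = scores / scores.sum()
    idx = rng.choice(A.shape[0], size=m, replace=True, p=probs)
    scale = 1.0 / np.sqrt(m * probs[idx])  # rescaling makes E[(SA)^T (SA)] = A^T A
    return A[idx] * scale[:, None], idx    # idx lets the caller sketch b with the same rows
```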
arXiv Detail & Related papers (2023-08-06T12:22:12Z) - Langevin Monte Carlo for Contextual Bandits [72.00524614312002]
Langevin Monte Carlo Thompson Sampling (LMC-TS) is proposed to directly sample from the posterior distribution in contextual bandits.
We prove that the proposed algorithm achieves the same sublinear regret bound as the best Thompson sampling algorithms for a special case of contextual bandits.
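A minimal sketch of that idea, assuming a Gaussian linear reward model and made-up step sizes and dimensions: run a few unadjusted Langevin steps on the negative log-posterior to obtain an approximate posterior sample, then act greedily under that sample. This is only the generic LMC Thompson sampling recipe, not the paper's tuned algorithm.

```python
import numpy as np

def lmc_posterior_sample(X, y, theta, step=1e-3, n_steps=50, lam=1.0, noise=1.0, rng=None):
    """Approximate posterior sample for Bayesian linear regression via
    unadjusted Langevin dynamics on the negative log-posterior."""
    rng = np.random.default_rng() if rng is None else rng
    for _ in range(n_steps):
        grad = (X.T @ (X @ theta - y)) / noise**2 + lam * theta  # gradient of -log posterior
        theta = theta - step * grad + np.sqrt(2 * step) * rng.standard_normal(theta.shape)
    return theta

# One illustrative bandit round: sample theta, then play the greedy arm under the sample
rng = np.random.default_rng(0)
d = 5
history_X, history_y = rng.standard_normal((20, d)), rng.standard_normal(20)
arms = rng.standard_normal((10, d))
theta = lmc_posterior_sample(history_X, history_y, np.zeros(d), rng=rng)
chosen_arm = int(np.argmax(arms @ theta))
```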
arXiv Detail & Related papers (2022-06-22T17:58:23Z) - Distributed Sketching for Randomized Optimization: Exact
Characterization, Concentration and Lower Bounds [54.51566432934556]
We consider distributed optimization methods for problems where forming the Hessian is computationally challenging.
We leverage randomized sketches for reducing the problem dimensions as well as preserving privacy and improving straggler resilience in asynchronous distributed systems.
arXiv Detail & Related papers (2022-03-18T05:49:13Z) - Accumulations of Projections--A Unified Framework for Random Sketches in
Kernel Ridge Regression [12.258887270632869]
Building a sketch of an n-by-n empirical kernel matrix is a common approach to accelerate the computation of many kernel methods.
We propose a unified framework of constructing sketching methods in kernel ridge regression.
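One concrete instance of such a sketch is uniform sampling of landmark points (a Nystrom approximation), which the unified framework covers alongside other constructions; a hedged NumPy version for kernel ridge regression is below, with the kernel, bandwidth, and sizes chosen only for illustration.

```python
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and Z."""
    d2 = np.sum(X**2, 1)[:, None] + np.sum(Z**2, 1)[None, :] - 2 * X @ Z.T
    return np.exp(-gamma * d2)

def nystrom_krr(X, y, m, lam=1e-2, gamma=1.0, rng=None):
    """Kernel ridge regression with a uniform-sampling (Nystrom) sketch:
    only an n x m slice of the kernel matrix is ever formed."""
    rng = np.random.default_rng() if rng is None else rng
    idx = rng.choice(len(X), size=m, replace=False)
    K_nm = rbf_kernel(X, X[idx], gamma)
    K_mm = rbf_kernel(X[idx], X[idx], gamma)
    w = np.linalg.solve(K_nm.T @ K_nm + lam * K_mm, K_nm.T @ y)
    def predict(X_new):
        return rbf_kernel(X_new, X[idx], gamma) @ w
    return predict
```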
arXiv Detail & Related papers (2021-03-06T05:02:17Z) - Stochastic Saddle-Point Optimization for Wasserstein Barycenters [69.68068088508505]
We consider the population Wasserstein barycenter problem for random probability measures supported on a finite set of points and generated by an online stream of data.
We employ the structure of the problem and obtain a convex-concave saddle-point reformulation of this problem.
In the setting when the distribution of random probability measures is discrete, we propose an optimization algorithm and estimate its complexity.
arXiv Detail & Related papers (2020-06-11T19:40:38Z) - Fitting Laplacian Regularized Stratified Gaussian Models [0.0]
We consider the problem of jointly estimating multiple related zero-mean Gaussian distributions from data.
We propose a distributed method that scales to large problems, and illustrate the efficacy of the method with examples in finance, radar signal processing, and weather forecasting.
arXiv Detail & Related papers (2020-05-04T18:00:59Z) - Distributed Averaging Methods for Randomized Second Order Optimization [54.51566432934556]
We consider distributed optimization problems where forming the Hessian is computationally challenging and communication is a bottleneck.
We develop unbiased parameter averaging methods for randomized second order optimization that employ sampling and sketching of the Hessian.
We also extend the framework of second order averaging methods to introduce an unbiased distributed optimization framework for heterogeneous computing systems.
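A hedged sketch of the averaging idea for second-order updates follows, using a Gaussian sketch of the data matrix to form an approximate Hessian for ridge regression; the unbiasedness corrections and the treatment of heterogeneous workers described in that abstract are not reproduced here.

```python
import numpy as np

def sketched_newton_step(A, b, x, lam, m, rng):
    """One worker: Newton step for ridge regression with a sketched Hessian
    (SA)^T (SA) + lam I in place of A^T A + lam I; the gradient stays exact."""
    S = rng.standard_normal((m, A.shape[0])) / np.sqrt(m)
    SA = S @ A
    H = SA.T @ SA + lam * np.eye(A.shape[1])
    g = A.T @ (A @ x - b) + lam * x
    return x - np.linalg.solve(H, g)

def averaged_newton_sketch(A, b, lam=1e-2, m=500, workers=10, iters=5, seed=0):
    """Master loop: average the workers' independently sketched Newton iterates."""
    rng = np.random.default_rng(seed)
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        steps = [sketched_newton_step(A, b, x, lam, m, rng) for _ in range(workers)]
        x = np.mean(steps, axis=0)
    return x
```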
arXiv Detail & Related papers (2020-02-16T09:01:18Z)