Distributed Sketching Methods for Privacy Preserving Regression
- URL: http://arxiv.org/abs/2002.06538v2
- Date: Sat, 20 Jun 2020 00:36:01 GMT
- Title: Distributed Sketching Methods for Privacy Preserving Regression
- Authors: Burak Bartan, Mert Pilanci
- Abstract summary: We leverage randomized sketches for reducing the problem dimensions as well as preserving privacy and improving straggler resilience in asynchronous distributed systems.
We derive novel approximation guarantees for classical sketching methods and analyze the accuracy of parameter averaging for distributed sketches.
We illustrate the performance of distributed sketches in a serverless computing platform with large scale experiments.
- Score: 54.51566432934556
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we study distributed sketching methods for large scale
regression problems. We leverage multiple randomized sketches for reducing the
problem dimensions as well as preserving privacy and improving straggler
resilience in asynchronous distributed systems. We derive novel approximation
guarantees for classical sketching methods and analyze the accuracy of
parameter averaging for distributed sketches. We consider random matrices
including Gaussian, randomized Hadamard, uniform sampling and leverage score
sampling in the distributed setting. Moreover, we propose a hybrid approach
combining sampling and fast random projections for better computational
efficiency. We illustrate the performance of distributed sketches in a
serverless computing platform with large scale experiments.
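As a rough illustration of the sketch-and-average idea described in the abstract (not the paper's exact algorithm or experimental setup), the NumPy sketch below has each worker apply an independent Gaussian sketch to (A, b), solve the reduced least squares problem, and send back its estimate for averaging; all matrix sizes and data are made up for illustration.

```python
import numpy as np

def sketched_solution(A, b, m, rng):
    """One worker: apply an m x n Gaussian sketch and solve the reduced problem."""
    n = A.shape[0]
    S = rng.standard_normal((m, n)) / np.sqrt(m)  # Gaussian sketch (one of several options)
    return np.linalg.lstsq(S @ A, S @ b, rcond=None)[0]

def distributed_sketch_average(A, b, m, num_workers, seed=0):
    """Average independently sketched solutions; workers never see the raw (A, b)."""
    rng = np.random.default_rng(seed)
    estimates = [sketched_solution(A, b, m, rng) for _ in range(num_workers)]
    return np.mean(estimates, axis=0)

# Illustrative usage on synthetic data (not the paper's experiments)
rng = np.random.default_rng(1)
A = rng.standard_normal((10000, 50))
b = A @ rng.standard_normal(50) + 0.1 * rng.standard_normal(10000)
x_avg = distributed_sketch_average(A, b, m=500, num_workers=20)
x_ls = np.linalg.lstsq(A, b, rcond=None)[0]
print(np.linalg.norm(x_avg - x_ls) / np.linalg.norm(x_ls))  # relative error of the averaged estimate
```

Because each worker only ever operates on a sketched pair (SA, Sb) and the workers are mutually independent, slow or missing workers can simply be dropped from the average, which is how the abstract's privacy and straggler-resilience claims fit together.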
Related papers
- Distributed Markov Chain Monte Carlo Sampling based on the Alternating
Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
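For orientation only: that paper's contribution is a distributed sampling (MCMC) scheme, but the consensus-ADMM skeleton it builds on resembles the deterministic ridge regression sketch below; the data split into (A_i, b_i), the penalty rho, and the iteration count are illustrative assumptions, not values from the paper.

```python
import numpy as np

def consensus_admm_ridge(A_parts, b_parts, lam=0.1, rho=1.0, iters=100):
    """Plain consensus ADMM for ridge regression split across workers.
    Each worker holds (A_i, b_i); z is the shared consensus variable."""
    d = A_parts[0].shape[1]
    K = len(A_parts)
    x = [np.zeros(d) for _ in range(K)]
    u = [np.zeros(d) for _ in range(K)]
    z = np.zeros(d)
    for _ in range(iters):
        for i in range(K):
            # local x-update: (A_i^T A_i + rho I) x_i = A_i^T b_i + rho (z - u_i)
            lhs = A_parts[i].T @ A_parts[i] + rho * np.eye(d)
            rhs = A_parts[i].T @ b_parts[i] + rho * (z - u[i])
            x[i] = np.linalg.solve(lhs, rhs)
        # global z-update with the ridge penalty, then dual updates
        z = rho * sum(x[i] + u[i] for i in range(K)) / (lam + rho * K)
        for i in range(K):
            u[i] += x[i] - z
    return z
```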
arXiv Detail & Related papers (2024-01-29T02:08:40Z) - Gradient Coding with Iterative Block Leverage Score Sampling [42.21200677508463]
We generalize the leverage score sampling sketch for $\ell_2$-subspace embeddings, to accommodate sampling subsets of the transformed data.
This is then used to derive an approximate coded computing approach for first-order methods.
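That paper generalizes leverage score sampling to blocks of transformed data; for context, the plain row-wise leverage score sampling sketch it starts from can be written as below, with the sketch size m left to the caller and the rescaling chosen so the sketched Gram matrix is unbiased.

```python
import numpy as np

def leverage_score_sampling_sketch(A, m, rng=None):
    """Sample m rows of A with probability proportional to their leverage scores
    and rescale, giving an approximate l2-subspace embedding of A."""
    rng = np.random.default_rng() if rng is None else rng
    Q, _ = np.linalg.qr(A)                 # thin QR; squared row norms of Q are leverage scores
    scores = np.sum(Q**2, axis=1)
    probs = scores / scores.sum()
    idx = rng.choice(A.shape[0], size=m, replace=True, p=probs)
    scale = 1.0 / np.sqrt(m * probs[idx])  # rescaling makes E[(SA)^T (SA)] = A^T A
    return A[idx] * scale[:, None], idx    # idx lets the caller sketch b with the same rows
```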
arXiv Detail & Related papers (2023-08-06T12:22:12Z) - Langevin Monte Carlo for Contextual Bandits [72.00524614312002]
Langevin Monte Carlo Thompson Sampling (LMC-TS) is proposed to directly sample from the posterior distribution in contextual bandits.
We prove that the proposed algorithm achieves the same sublinear regret bound as the best Thompson sampling algorithms for a special case of contextual bandits.
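A minimal sketch of that idea, assuming a Gaussian linear reward model and made-up step sizes and dimensions: run a few unadjusted Langevin steps on the negative log-posterior to obtain an approximate posterior sample, then act greedily under that sample. This is only the generic LMC Thompson sampling recipe, not the paper's tuned algorithm.

```python
import numpy as np

def lmc_posterior_sample(X, y, theta, step=1e-3, n_steps=50, lam=1.0, noise=1.0, rng=None):
    """Approximate posterior sample for Bayesian linear regression via
    unadjusted Langevin dynamics on the negative log-posterior."""
    rng = np.random.default_rng() if rng is None else rng
    for _ in range(n_steps):
        grad = (X.T @ (X @ theta - y)) / noise**2 + lam * theta  # gradient of -log posterior
        theta = theta - step * grad + np.sqrt(2 * step) * rng.standard_normal(theta.shape)
    return theta

# One illustrative bandit round: sample theta, then play the greedy arm under the sample
rng = np.random.default_rng(0)
d = 5
history_X, history_y = rng.standard_normal((20, d)), rng.standard_normal(20)
arms = rng.standard_normal((10, d))
theta = lmc_posterior_sample(history_X, history_y, np.zeros(d), rng=rng)
chosen_arm = int(np.argmax(arms @ theta))
```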
arXiv Detail & Related papers (2022-06-22T17:58:23Z) - Distributed Sketching for Randomized Optimization: Exact
Characterization, Concentration and Lower Bounds [54.51566432934556]
We consider distributed optimization methods for problems where forming the Hessian is computationally challenging.
We leverage randomized sketches for reducing the problem dimensions as well as preserving privacy and improving straggler resilience in asynchronous distributed systems.
arXiv Detail & Related papers (2022-03-18T05:49:13Z) - Accumulations of Projections--A Unified Framework for Random Sketches in
Kernel Ridge Regression [12.258887270632869]
Building a sketch of an n-by-n empirical kernel matrix is a common approach to accelerate the computation of many kernel methods.
We propose a unified framework of constructing sketching methods in kernel ridge regression.
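One concrete instance of such a sketch is uniform sampling of landmark points (a Nystrom approximation), which the unified framework covers alongside other constructions; a hedged NumPy version for kernel ridge regression is below, with the kernel, bandwidth, and sizes chosen only for illustration.

```python
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of X and Z."""
    d2 = np.sum(X**2, 1)[:, None] + np.sum(Z**2, 1)[None, :] - 2 * X @ Z.T
    return np.exp(-gamma * d2)

def nystrom_krr(X, y, m, lam=1e-2, gamma=1.0, rng=None):
    """Kernel ridge regression with a uniform-sampling (Nystrom) sketch:
    only an n x m slice of the kernel matrix is ever formed."""
    rng = np.random.default_rng() if rng is None else rng
    idx = rng.choice(len(X), size=m, replace=False)
    K_nm = rbf_kernel(X, X[idx], gamma)
    K_mm = rbf_kernel(X[idx], X[idx], gamma)
    w = np.linalg.solve(K_nm.T @ K_nm + lam * K_mm, K_nm.T @ y)
    def predict(X_new):
        return rbf_kernel(X_new, X[idx], gamma) @ w
    return predict
```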
arXiv Detail & Related papers (2021-03-06T05:02:17Z) - Stochastic Saddle-Point Optimization for Wasserstein Barycenters [69.68068088508505]
We consider the population Wasserstein barycenter problem for random probability measures supported on a finite set of points and generated by an online stream of data.
We employ the structure of the problem and obtain a convex-concave saddle-point reformulation of this problem.
In the setting when the distribution of random probability measures is discrete, we propose an optimization algorithm and estimate its complexity.
arXiv Detail & Related papers (2020-06-11T19:40:38Z) - Fitting Laplacian Regularized Stratified Gaussian Models [0.0]
We consider the problem of jointly estimating multiple related zero-mean Gaussian distributions from data.
We propose a distributed method that scales to large problems, and illustrate the efficacy of the method with examples in finance, radar signal processing, and weather forecasting.
arXiv Detail & Related papers (2020-05-04T18:00:59Z) - Distributed Averaging Methods for Randomized Second Order Optimization [54.51566432934556]
We consider distributed optimization problems where forming the Hessian is computationally challenging and communication is a bottleneck.
We develop unbiased parameter averaging methods for randomized second order optimization that employ sampling and sketching of the Hessian.
We also extend the framework of second order averaging methods to introduce an unbiased distributed optimization framework for heterogeneous computing systems.
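A hedged sketch of the averaging idea for second-order updates follows, using a Gaussian sketch of the data matrix to form an approximate Hessian for ridge regression; the unbiasedness corrections and the treatment of heterogeneous workers described in that abstract are not reproduced here.

```python
import numpy as np

def sketched_newton_step(A, b, x, lam, m, rng):
    """One worker: Newton step for ridge regression with a sketched Hessian
    (SA)^T (SA) + lam I in place of A^T A + lam I; the gradient stays exact."""
    S = rng.standard_normal((m, A.shape[0])) / np.sqrt(m)
    SA = S @ A
    H = SA.T @ SA + lam * np.eye(A.shape[1])
    g = A.T @ (A @ x - b) + lam * x
    return x - np.linalg.solve(H, g)

def averaged_newton_sketch(A, b, lam=1e-2, m=500, workers=10, iters=5, seed=0):
    """Master loop: average the workers' independently sketched Newton iterates."""
    rng = np.random.default_rng(seed)
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        steps = [sketched_newton_step(A, b, x, lam, m, rng) for _ in range(workers)]
        x = np.mean(steps, axis=0)
    return x
```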
arXiv Detail & Related papers (2020-02-16T09:01:18Z)