Distributed Sketching for Randomized Optimization: Exact
Characterization, Concentration and Lower Bounds
- URL: http://arxiv.org/abs/2203.09755v1
- Date: Fri, 18 Mar 2022 05:49:13 GMT
- Title: Distributed Sketching for Randomized Optimization: Exact
Characterization, Concentration and Lower Bounds
- Authors: Burak Bartan, Mert Pilanci
- Abstract summary: We consider distributed optimization methods for problems where forming the Hessian is computationally challenging.
We leverage randomized sketches for reducing the problem dimensions as well as preserving privacy and improving straggler resilience in asynchronous distributed systems.
- Score: 54.51566432934556
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider distributed optimization methods for problems where forming the
Hessian is computationally challenging and communication is a significant
bottleneck. We leverage randomized sketches for reducing the problem dimensions
as well as preserving privacy and improving straggler resilience in
asynchronous distributed systems. We derive novel approximation guarantees for
classical sketching methods and establish tight concentration results that
serve as both upper and lower bounds on the error. We then extend our analysis
to the accuracy of parameter averaging for distributed sketches. Furthermore,
we develop unbiased parameter averaging methods for randomized second order
optimization for regularized problems that employ sketching of the Hessian.
Existing works do not take the bias of the estimators into consideration, which
limits their application to massively parallel computation. We provide
closed-form formulas for regularization parameters and step sizes that provably
minimize the bias for sketched Newton directions. Additionally, we demonstrate
the implications of our theoretical findings via large scale experiments on a
serverless cloud computing platform.
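The sketch-and-average scheme described in the abstract can be illustrated with a minimal NumPy example. This is a hedged sketch, not the paper's method: the dimensions, the fixed ridge parameter `lam`, and the plain Gaussian sketch are illustrative assumptions, and no debiasing correction is applied.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m, q = 2048, 20, 256, 8  # rows, cols, sketch size, number of workers
A = rng.standard_normal((n, d))
x_true = rng.standard_normal(d)
b = A @ x_true + 0.01 * rng.standard_normal(n)
lam = 1e-3  # illustrative ridge parameter, not the paper's optimized choice

# Exact regularized least-squares solution, for reference.
x_exact = np.linalg.solve(A.T @ A + lam * np.eye(d), A.T @ b)

# Each worker applies an independent Gaussian sketch S_k (m x n),
# solves the smaller sketched problem, and the server averages the results.
estimates = []
for _ in range(q):
    S = rng.standard_normal((m, n)) / np.sqrt(m)
    SA, Sb = S @ A, S @ b
    x_k = np.linalg.solve(SA.T @ SA + lam * np.eye(d), SA.T @ Sb)
    estimates.append(x_k)
x_avg = np.mean(estimates, axis=0)

err_single = np.linalg.norm(estimates[0] - x_exact)
err_avg = np.linalg.norm(x_avg - x_exact)
```

Averaging the independent sketched solutions reduces the variance of the estimate, but a bias toward the sketched problem remains; characterizing and removing that bias (via the choice of regularization and step size) is the abstract's contribution, which this toy example does not implement.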
Related papers
- Newton Meets Marchenko-Pastur: Massively Parallel Second-Order Optimization with Hessian Sketching and Debiasing [45.475515050909706]
We consider the problem of minimizing a convex function in a massively parallel fashion, where communication between workers is limited.
We propose a scheme where the central node (server) effectively runs a Newton method, offloading its high per-iteration cost.
In our solution, workers independently produce coarse but low-bias estimates of the inverse Hessian, using an adaptive sketching scheme.
arXiv Detail & Related papers (2024-10-02T09:38:04Z)
- Wasserstein Distributionally Robust Estimation in High Dimensions: Performance Analysis and Optimal Hyperparameter Tuning [0.0]
We propose a Wasserstein distributionally robust estimation framework to estimate an unknown parameter from noisy linear measurements.
We focus on the task of analyzing the squared error performance of such estimators.
We show that the squared error can be recovered as the solution of a convex-concave optimization problem.
arXiv Detail & Related papers (2022-06-27T13:02:59Z)
- Continuous Wasserstein-2 Barycenter Estimation without Minimax Optimization [94.18714844247766]
Wasserstein barycenters provide a geometric notion of the weighted average of probability measures based on optimal transport.
We present a scalable algorithm to compute Wasserstein-2 barycenters given sample access to the input measures.
arXiv Detail & Related papers (2021-02-02T21:01:13Z)
- Accelerated, Optimal, and Parallel: Some Results on Model-Based Stochastic Optimization [33.71051480619541]
We extend the Approximate-Proximal Point (aProx) family of model-based methods for solving convex optimization problems.
We provide non-asymptotic convergence guarantees and an acceleration scheme for which we provide linear speedup in minibatch size.
We show improved convergence rates and matching lower bounds identifying new fundamental constants for "interpolation" problems.
arXiv Detail & Related papers (2021-01-07T18:58:39Z)
- Effective Dimension Adaptive Sketching Methods for Faster Regularized Least-Squares Optimization [56.05635751529922]
We propose a new randomized algorithm for solving L2-regularized least-squares problems based on sketching.
We consider two of the most popular random embeddings, namely, Gaussian embeddings and the Subsampled Randomized Hadamard Transform (SRHT).
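The SRHT mentioned above can be sketched in a few lines of NumPy. This is an illustrative construction under stated assumptions: the row count is a power of two so the Hadamard matrix can be built by the Sylvester recursion, and rows are subsampled uniformly without replacement.

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, m = 256, 8, 64  # n must be a power of two for the Sylvester construction
A = rng.standard_normal((n, d))

# SRHT sketch: S = sqrt(n/m) * P H D, where D flips signs at random,
# H is the orthonormal Hadamard matrix, and P subsamples m rows uniformly.
D = rng.choice([-1.0, 1.0], size=n)
H = np.array([[1.0]])
while H.shape[0] < n:               # Sylvester recursion: H_{2k} = [[H, H], [H, -H]]
    H = np.block([[H, H], [H, -H]])
H /= np.sqrt(n)                     # normalize so H is orthonormal

rows = rng.choice(n, size=m, replace=False)
SA = np.sqrt(n / m) * (H @ (D[:, None] * A))[rows]

# The sketch approximately preserves the Gram matrix A^T A.
gram_err = np.linalg.norm(SA.T @ SA - A.T @ A) / np.linalg.norm(A.T @ A)
```

The sign flip and Hadamard rotation spread out the rows of `A` so that a small uniform subsample already captures the Gram matrix well, which is what makes the SRHT competitive with dense Gaussian embeddings at much lower cost.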
arXiv Detail & Related papers (2020-06-10T15:00:09Z)
- Distributed Averaging Methods for Randomized Second Order Optimization [54.51566432934556]
We consider distributed optimization problems where forming the Hessian is computationally challenging and communication is a bottleneck.
We develop unbiased parameter averaging methods for randomized second order optimization that employ sampling and sketching of the Hessian.
We also extend the framework of second order averaging methods to introduce an unbiased distributed optimization framework for heterogeneous computing systems.
arXiv Detail & Related papers (2020-02-16T09:01:18Z)
- Distributed Sketching Methods for Privacy Preserving Regression [54.51566432934556]
We leverage randomized sketches for reducing the problem dimensions as well as preserving privacy and improving straggler resilience in asynchronous distributed systems.
We derive novel approximation guarantees for classical sketching methods and analyze the accuracy of parameter averaging for distributed sketches.
We illustrate the performance of distributed sketches in a serverless computing platform with large scale experiments.
arXiv Detail & Related papers (2020-02-16T08:35:48Z)
- Support recovery and sup-norm convergence rates for sparse pivotal estimation [79.13844065776928]
In high dimensional sparse regression, pivotal estimators are estimators for which the optimal regularization parameter is independent of the noise level.
We show minimax sup-norm convergence rates for non-smoothed and smoothed, single-task and multitask square-root Lasso-type estimators.
arXiv Detail & Related papers (2020-01-15T16:11:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.