Communication-efficient distributed eigenspace estimation
- URL: http://arxiv.org/abs/2009.02436v2
- Date: Fri, 30 Apr 2021 22:43:50 GMT
- Title: Communication-efficient distributed eigenspace estimation
- Authors: Vasileios Charisopoulos, Austin R. Benson, Anil Damle
- Abstract summary: We develop a communication-efficient distributed algorithm for computing the leading invariant subspace of a data matrix.
Our algorithm uses a novel alignment scheme that minimizes the Procrustean distance between local solutions and a reference solution.
We show that our algorithm achieves a similar error rate to that of a centralized estimator.
- Score: 31.69089186688224
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Distributed computing is a standard way to scale up machine learning and data
science algorithms to process large amounts of data. In such settings, avoiding
communication amongst machines is paramount for achieving high performance.
Rather than distribute the computation of existing algorithms, a common
practice for avoiding communication is to compute local solutions or parameter
estimates on each machine and then combine the results; in many convex
optimization problems, even simple averaging of local solutions can work well.
However, these schemes do not work when the local solutions are not unique.
Spectral methods are a collection of such problems, where solutions are
orthonormal bases of the leading invariant subspace of an associated data
matrix, which are only unique up to rotations and reflections. Here, we develop
a communication-efficient distributed algorithm for computing the leading
invariant subspace of a data matrix. Our algorithm uses a novel alignment
scheme that minimizes the Procrustean distance between local solutions and a
reference solution, and only requires a single round of communication. For the
important case of principal component analysis (PCA), we show that our
algorithm achieves a similar error rate to that of a centralized estimator. We
present numerical experiments demonstrating the efficacy of our proposed
algorithm for distributed PCA, as well as other problems where solutions
exhibit rotational symmetry, such as node embeddings for graph data and
spectral initialization for quadratic sensing.
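The following is a minimal sketch of the "align, then combine" idea described above, written for symmetric matrices held on different machines. It assumes the first machine's local basis serves as the reference, that the combined estimate is a simple average of the aligned bases followed by QR re-orthonormalization, and that alignment happens where the local bases are gathered after the single communication round; the function names are illustrative, not taken from the paper's code.
```python
import numpy as np

def leading_subspace(A, k):
    """Orthonormal basis for the k leading eigenvectors of a symmetric matrix A."""
    vals, vecs = np.linalg.eigh(A)                      # eigenvalues in ascending order
    return vecs[:, np.argsort(vals)[::-1][:k]]

def procrustes_rotation(V, V_ref):
    """Orthogonal Q minimizing ||V @ Q - V_ref||_F (orthogonal Procrustes problem)."""
    U, _, Wt = np.linalg.svd(V.T @ V_ref)
    return U @ Wt

def distributed_eigenspace(local_matrices, k):
    """One-round estimate: align each local basis to a reference, then average."""
    local_bases = [leading_subspace(A, k) for A in local_matrices]
    V_ref = local_bases[0]                              # reference solution, e.g. machine 0's basis
    aligned = [V @ procrustes_rotation(V, V_ref) for V in local_bases]
    V_bar = sum(aligned) / len(aligned)                 # averaging is meaningful after alignment
    Q, _ = np.linalg.qr(V_bar)                          # re-orthonormalize the averaged basis
    return Q

# Illustrative usage: leading 2-dimensional subspace of per-machine sample covariances.
rng = np.random.default_rng(0)
samples = [rng.standard_normal((500, 10)) for _ in range(4)]
covariances = [X.T @ X / X.shape[0] for X in samples]
V_hat = distributed_eigenspace(covariances, k=2)
```
The Procrustes step resolves the rotation and reflection ambiguity of each local basis before averaging, which is precisely why naive averaging of local solutions fails for these problems.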
Related papers
- Distributed Linear Regression with Compositional Covariates [5.085889377571319]
We focus on the distributed sparse penalized linear log-contrast model in massive compositional data.
Two distributed optimization techniques are proposed for solving the two different constrained convex optimization problems.
In the decentralized topology, we introduce a distributed coordinate-wise descent algorithm for obtaining a communication-efficient regularized estimation.
arXiv Detail & Related papers (2023-10-21T11:09:37Z) - Fast Optimal Locally Private Mean Estimation via Random Projections [58.603579803010796]
We study the problem of locally private mean estimation of high-dimensional vectors in the Euclidean ball.
We propose a new algorithmic framework, ProjUnit, for private mean estimation.
Our framework is deceptively simple: each randomizer projects its input to a random low-dimensional subspace, normalizes the result, and then runs an optimal algorithm.
arXiv Detail & Related papers (2023-06-07T14:07:35Z) - Linearized Wasserstein dimensionality reduction with approximation guarantees [65.16758672591365]
LOT Wassmap is a computationally feasible algorithm to uncover low-dimensional structures in the Wasserstein space.
We show that LOT Wassmap attains correct embeddings and that the quality improves with increased sample size.
We also show how LOT Wassmap significantly reduces the computational cost when compared to algorithms that depend on pairwise distance computations.
arXiv Detail & Related papers (2023-02-14T22:12:16Z) - On Accelerating Distributed Convex Optimizations [0.0]
This paper studies a distributed multi-agent convex optimization problem.
We show that the proposed algorithm converges linearly, with an improved rate of convergence over traditional and adaptive gradient-descent methods.
We demonstrate our algorithm's superior performance compared to prominent distributed algorithms for solving real logistic regression problems.
arXiv Detail & Related papers (2021-08-19T13:19:54Z) - Analysis of Truncated Orthogonal Iteration for Sparse Eigenvector Problems [78.95866278697777]
We propose two variants of the Truncated Orthogonal Iteration to compute multiple leading eigenvectors with sparsity constraints simultaneously.
We then apply our algorithms to solve the sparse principal component analysis problem for a wide range of test datasets.
arXiv Detail & Related papers (2021-03-24T23:11:32Z) - A Linearly Convergent Algorithm for Distributed Principal Component Analysis [12.91948651812873]
This paper introduces a feedforward neural network-based one time-scale distributed PCA algorithm termed Distributed Sanger's Algorithm (DSA).
The proposed algorithm is shown to converge linearly to a neighborhood of the true solution.
arXiv Detail & Related papers (2021-01-05T00:51:14Z) - Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method, by combining reconstruction error with $l_{2,p}$-norm regularization.
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
arXiv Detail & Related papers (2020-12-29T04:08:38Z) - FedPD: A Federated Learning Framework with Optimal Rates and Adaptivity to Non-IID Data [59.50904660420082]
Federated Learning (FL) has become a popular paradigm for learning from distributed data.
To effectively utilize data at different devices without moving them to the cloud, algorithms such as Federated Averaging (FedAvg) have adopted a "computation then aggregation" (CTA) model.
arXiv Detail & Related papers (2020-05-22T23:07:42Z) - Optimization in Machine Learning: A Distribution Space Approach [16.038814087205143]
We present the viewpoint that optimization problems in machine learning can often be interpreted as minimizing a convex functional over a function space.
We recast such problems, via a suitable relaxation, as convex optimization problems in the space of distributions.
We develop a numerical algorithm based on mixture distributions to perform approximate optimization directly in distribution space.
arXiv Detail & Related papers (2020-04-18T13:38:06Z) - Channel Assignment in Uplink Wireless Communication using Machine Learning Approach [54.012791474906514]
This letter investigates a channel assignment problem in uplink wireless communication systems.
Our goal is to maximize the sum rate of all users subject to integer channel assignment constraints.
Due to the high computational complexity of this problem, machine learning approaches are employed to obtain computationally efficient solutions.
arXiv Detail & Related papers (2020-01-12T15:54:20Z)