One-shot Distributed Algorithm for Generalized Eigenvalue Problem
- URL: http://arxiv.org/abs/2010.11625v2
- Date: Wed, 24 Nov 2021 08:31:35 GMT
- Title: One-shot Distributed Algorithm for Generalized Eigenvalue Problem
- Authors: Kexin Lv, Fan He, Xiaolin Huang, Jie Yang and Liming Chen
- Abstract summary: Generalized eigenvalue problem (GEP) plays a vital role in a large family of high-dimensional statistical models.
Here we propose a general distributed GEP framework with one-shot communication for GEP.
- Score: 23.9525986377055
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Nowadays, more and more datasets are stored in a distributed way for the sake
of memory storage or data privacy. The generalized eigenvalue problem (GEP)
plays a vital role in a large family of high-dimensional statistical models.
However, the existing distributed method for eigenvalue decomposition cannot be
applied in GEP for the divergence of the empirical covariance matrix. Here we
propose a general distributed GEP framework with one-shot communication for
GEP. If the symmetric data covariance has repeated eigenvalues, e.g., in
canonical component analysis, we further modify the method for better
convergence. The theoretical analysis on approximation error is conducted and
the relation to the divergence of the data covariance, the eigenvalues of the
empirical data covariance, and the number of local servers is analyzed.
Numerical experiments also show the effectiveness of the proposed algorithms.
Related papers
- Towards a Unified Theory for Semiparametric Data Fusion with Individual-Level Data [1.0650780147044159]
We address the goal of conducting inference about a smooth finite-dimensional parameter by utilizing individual-level data from various independent sources.
Recent advancements have led to the development of a comprehensive theory capable of handling scenarios where different data sources align with, possibly distinct subsets of, conditional distributions of a single factorization of the joint target distribution.
We extend the aforementioned comprehensive theory to allow for the fusion of individual-level data from sources aligned with conditional distributions that do not correspond to a single factorization of the target distribution.
arXiv Detail & Related papers (2024-09-16T04:10:44Z) - A Geometric Unification of Distributionally Robust Covariance Estimators: Shrinking the Spectrum by Inflating the Ambiguity Set [20.166217494056916]
We propose a principled approach to construct covariance estimators without imposing restrictive assumptions.
We show that our robust estimators are efficiently computable and consistent.
Numerical experiments based on synthetic and real data show that our robust estimators are competitive with state-of-the-art estimators.
arXiv Detail & Related papers (2024-05-30T15:01:18Z) - Collaborative Heterogeneous Causal Inference Beyond Meta-analysis [68.4474531911361]
We propose a collaborative inverse propensity score estimator for causal inference with heterogeneous data.
Our method shows significant improvements over the methods based on meta-analysis when heterogeneity increases.
arXiv Detail & Related papers (2024-04-24T09:04:36Z) - Optimal Differentially Private PCA and Estimation for Spiked Covariance Matrices [10.377683220196873]
Estimating a covariance matrix and its associated principal components is a fundamental problem in contemporary statistics.
We study optimal differentially private Principal Component Analysis (PCA) and covariance estimation within the spiked covariance model.
We propose computationally efficient differentially private estimators and prove their minimax optimality for sub-Gaussian distributions.
arXiv Detail & Related papers (2024-01-08T11:18:14Z) - Conformal inference for regression on Riemannian Manifolds [49.7719149179179]
We investigate prediction sets for regression scenarios when the response variable, denoted by $Y$, resides in a manifold, and the covariable, denoted by X, lies in Euclidean space.
We prove the almost sure convergence of the empirical version of these regions on the manifold to their population counterparts.
arXiv Detail & Related papers (2023-10-12T10:56:25Z) - Equivariance Discovery by Learned Parameter-Sharing [153.41877129746223]
We study how to discover interpretable equivariances from data.
Specifically, we formulate this discovery process as an optimization problem over a model's parameter-sharing schemes.
Also, we theoretically analyze the method for Gaussian data and provide a bound on the mean squared gap between the studied discovery scheme and the oracle scheme.
arXiv Detail & Related papers (2022-04-07T17:59:19Z) - Gaussian Graphical Models as an Ensemble Method for Distributed Gaussian
Processes [8.4159776055506]
We propose a novel approach for aggregating the Gaussian experts' predictions by Gaussian graphical model (GGM)
We first estimate the joint distribution of latent and observed variables using the Expectation-Maximization (EM) algorithm.
Our new method outperforms other state-of-the-art DGP approaches.
arXiv Detail & Related papers (2022-02-07T15:22:56Z) - Optimal regularizations for data generation with probabilistic graphical
models [0.0]
Empirically, well-chosen regularization schemes dramatically improve the quality of the inferred models.
We consider the particular case of L 2 and L 1 regularizations in the Maximum A Posteriori (MAP) inference of generative pairwise graphical models.
arXiv Detail & Related papers (2021-12-02T14:45:16Z) - Benign Overfitting of Constant-Stepsize SGD for Linear Regression [122.70478935214128]
inductive biases are central in preventing overfitting empirically.
This work considers this issue in arguably the most basic setting: constant-stepsize SGD for linear regression.
We reflect on a number of notable differences between the algorithmic regularization afforded by (unregularized) SGD in comparison to ordinary least squares.
arXiv Detail & Related papers (2021-03-23T17:15:53Z) - Accounting for Unobserved Confounding in Domain Generalization [107.0464488046289]
This paper investigates the problem of learning robust, generalizable prediction models from a combination of datasets.
Part of the challenge of learning robust models lies in the influence of unobserved confounders.
We demonstrate the empirical performance of our approach on healthcare data from different modalities.
arXiv Detail & Related papers (2020-07-21T08:18:06Z) - Asymptotic Analysis of an Ensemble of Randomly Projected Linear
Discriminants [94.46276668068327]
In [1], an ensemble of randomly projected linear discriminants is used to classify datasets.
We develop a consistent estimator of the misclassification probability as an alternative to the computationally-costly cross-validation estimator.
We also demonstrate the use of our estimator for tuning the projection dimension on both real and synthetic data.
arXiv Detail & Related papers (2020-04-17T12:47:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.