Subspace Recovery from Heterogeneous Data with Non-isotropic Noise
- URL: http://arxiv.org/abs/2210.13497v1
- Date: Mon, 24 Oct 2022 18:00:08 GMT
- Title: Subspace Recovery from Heterogeneous Data with Non-isotropic Noise
- Authors: John Duchi, Vitaly Feldman, Lunjia Hu, Kunal Talwar
- Abstract summary: We study a basic formulation of this problem: principal component analysis (PCA), with a focus on irregular noise.
Our goal is to recover the linear subspace shared by the means $\mu_i$ using the data points from all users.
We design an efficiently-computable estimator under non-spherical and user-dependent noise.
- Score: 43.44371292901258
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recovering linear subspaces from data is a fundamental and important task in
statistics and machine learning. Motivated by heterogeneity in Federated
Learning settings, we study a basic formulation of this problem: the principal
component analysis (PCA), with a focus on dealing with irregular noise. Our
data come from $n$ users with user $i$ contributing data samples from a
$d$-dimensional distribution with mean $\mu_i$. Our goal is to recover the
linear subspace shared by $\mu_1,\ldots,\mu_n$ using the data points from all
users, where every data point from user $i$ is formed by adding an independent
mean-zero noise vector to $\mu_i$. If we only have one data point from every
user, subspace recovery is information-theoretically impossible when the
covariance matrices of the noise vectors can be non-spherical, necessitating
additional restrictive assumptions in previous work. We avoid these assumptions
by leveraging at least two data points from each user, which allows us to
design an efficiently-computable estimator under non-spherical and
user-dependent noise. We prove an upper bound for the estimation error of our
estimator in general scenarios where the number of data points and amount of
noise can vary across users, and prove an information-theoretic error lower
bound that not only matches the upper bound up to a constant factor, but also
holds even for spherical Gaussian noise. This implies that our estimator does
not introduce additional estimation error (up to a constant factor) due to
irregularity in the noise. We show additional results for a linear regression
problem in a similar setup.
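A way to see why two data points per user suffice: writing $x_{i,1} = \mu_i + z_{i,1}$ and $x_{i,2} = \mu_i + z_{i,2}$ with independent mean-zero noise, we get $\mathbb{E}[x_{i,1} x_{i,2}^\top] = \mu_i \mu_i^\top$, so the (possibly non-spherical, user-dependent) noise covariance cancels in expectation, whereas a single point gives $\mathbb{E}[x_{i,1} x_{i,1}^\top] = \mu_i \mu_i^\top + \Sigma_i$. The NumPy sketch below illustrates this cross-product idea; the function name, the uniform per-user weighting, and the toy data are illustrative assumptions, not the paper's exact estimator, which also handles varying sample counts and noise levels across users.

```python
import numpy as np

def recover_subspace(user_samples, k):
    """Estimate the k-dimensional subspace spanned by the user means.

    user_samples: list with one array of shape (m_i, d), m_i >= 2, per user.
    Returns a (d, k) matrix with orthonormal columns.

    Sketch of the cross-product idea: for two independent samples from the
    same user, E[x1 x2^T] = mu_i mu_i^T, regardless of the noise covariance.
    """
    d = user_samples[0].shape[1]
    A = np.zeros((d, d))
    for X in user_samples:
        x1, x2 = X[0], X[1]                               # two points from the same user
        A += 0.5 * (np.outer(x1, x2) + np.outer(x2, x1))  # symmetrized cross product
    A /= len(user_samples)
    eigvals, eigvecs = np.linalg.eigh(A)                  # eigenvalues in ascending order
    return eigvecs[:, np.argsort(eigvals)[::-1][:k]]      # top-k eigenvectors

# Toy run: means lie in a planted 2-dimensional subspace of R^10,
# with non-spherical, user-dependent noise.
rng = np.random.default_rng(0)
d, k, n = 10, 2, 2000
basis = np.linalg.qr(rng.normal(size=(d, k)))[0]          # orthonormal planted basis
users = []
for _ in range(n):
    mu = basis @ rng.normal(size=k)
    scales = rng.uniform(0.1, 2.0, size=d)                # per-coordinate noise scales
    users.append(mu + rng.normal(size=(2, d)) * scales)
U_hat = recover_subspace(users, k)
```

With enough users, the returned basis should align closely with the planted subspace despite the coordinate- and user-dependent noise scales, which is exactly the kind of irregularity that biases naive single-sample PCA.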
Related papers
- A Combinatorial Approach to Robust PCA [18.740048806623037]
We study the problem of recovering Gaussian data under adversarial corruptions.
We assume that the Gaussian noise lies in an unknown $k$-dimensional subspace $U \subseteq \mathbb{R}^d$, and that $s$ randomly chosen coordinates of each data point fall under the control of an adversary.
Our main result is an efficient algorithm that, when $ks^2 = O(d)$, recovers every single data point up to a nearly-optimal error of $\tilde{O}(ks/d)$ in expectation.
arXiv Detail & Related papers (2023-11-28T01:49:51Z) - High-probability Convergence Bounds for Nonlinear Stochastic Gradient Descent Under Heavy-tailed Noise [59.25598762373543]
We establish high-probability convergence guarantees for nonlinear stochastic gradient descent on streaming data in the presence of heavy-tailed noise.
We also demonstrate, analytically and empirically, how this analysis informs the preferred choice of setting for a given problem.
arXiv Detail & Related papers (2023-10-28T18:53:41Z) - Sample-Efficient Linear Representation Learning from Non-IID Non-Isotropic Data [4.971690889257356]
We introduce an adaptation of the alternating minimization-descent scheme proposed by Collins et al. and by Nayer and Vaswani.
We show that vanilla alternating minimization-descent fails catastrophically even for i.i.d., but mildly non-isotropic, data.
Our analysis unifies and generalizes prior work, and provides a flexible framework for a wider range of applications.
arXiv Detail & Related papers (2023-08-08T17:56:20Z) - General Gaussian Noise Mechanisms and Their Optimality for Unbiased Mean Estimation [58.03500081540042]
A classical approach to private mean estimation is to compute the true mean and add unbiased, but possibly correlated, Gaussian noise to it.
We show that, for every input dataset, an unbiased mean estimator satisfying concentrated differential privacy must introduce approximately at least as much error as a suitably calibrated Gaussian noise mechanism (a minimal sketch of the classical Gaussian baseline appears after this list).
arXiv Detail & Related papers (2023-01-31T18:47:42Z) - Optimizing the Noise in Self-Supervised Learning: from Importance Sampling to Noise-Contrastive Estimation [80.07065346699005]
It is widely assumed that the optimal noise distribution should be made equal to the data distribution, as in Generative Adversarial Networks (GANs).
We turn to Noise-Contrastive Estimation which grounds this self-supervised task as an estimation problem of an energy-based model of the data.
We soberly conclude that the optimal noise may be hard to sample from, and the gain in efficiency can be modest compared to choosing the noise distribution equal to the data's.
arXiv Detail & Related papers (2023-01-23T19:57:58Z) - Pitfalls of Gaussians as a noise distribution in NCE [22.23473249312549]
Noise Contrastive Estimation (NCE) is a popular approach for learning probability density functions parameterized up to a constant of proportionality.
We show that the choice of the noise distribution $q$ can severely impact the computational and statistical efficiency of NCE.
arXiv Detail & Related papers (2022-10-01T04:42:56Z) - The Optimal Noise in Noise-Contrastive Learning Is Not What You Think [80.07065346699005]
We show that deviating from the common assumption that the noise distribution should match the data distribution can actually lead to better statistical estimators.
In particular, the optimal noise distribution is different from the data's and even from a different family.
arXiv Detail & Related papers (2022-03-02T13:59:20Z) - Consistent Estimation for PCA and Sparse Regression with Oblivious Outliers [13.244654316770815]
We develop machinery to design efficiently computable and consistent estimators.
For sparse regression, we achieve consistency for the optimal sample size $n \gtrsim (k \log d)/\alpha^2$.
In the context of PCA, we attain optimal error guarantees under broad spikiness assumptions on the parameter matrix.
arXiv Detail & Related papers (2021-11-04T15:59:44Z) - Noise-robust Clustering [2.0199917525888895]
This paper presents noise-robust clustering techniques in unsupervised machine learning.
Uncertainty about noise, consistency, and other ambiguities can become severe obstacles in data analytics.
arXiv Detail & Related papers (2021-10-17T17:15:13Z) - Understanding the Under-Coverage Bias in Uncertainty Estimation [58.03725169462616]
Quantile regression tends to under-cover relative to the desired coverage level in practice.
We prove that quantile regression suffers from an inherent under-coverage bias.
Our theory reveals that this under-coverage bias stems from a certain high-dimensional parameter estimation error.
arXiv Detail & Related papers (2021-06-10T06:11:55Z)
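For context on the "General Gaussian Noise Mechanisms" entry above: the classical baseline it refers to releases the empirical mean plus Gaussian noise calibrated to the mean's sensitivity. Below is a minimal NumPy sketch of that textbook baseline, assuming every data point has $\ell_2$ norm at most $R$; the bound $R$, the function name, and the isotropic calibration are illustrative assumptions, and the cited paper studies more general, possibly correlated, Gaussian noise under concentrated differential privacy.

```python
import numpy as np

def private_mean(X, R, eps, delta, rng=None):
    """(eps, delta)-DP empirical mean via the classical Gaussian mechanism.

    X: array of shape (n, d) whose rows are assumed to satisfy ||x||_2 <= R,
    so replacing one point changes the mean by at most 2R/n in L2 norm.
    Illustrative sketch only (isotropic noise, valid for eps <= 1).
    """
    rng = np.random.default_rng() if rng is None else rng
    n, d = X.shape
    sensitivity = 2.0 * R / n                              # L2 sensitivity of the mean
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    return X.mean(axis=0) + rng.normal(scale=sigma, size=d)
```

The noise scale shrinks at rate $1/(n\epsilon)$, so the privacy cost becomes negligible as the number of data points grows.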
This list is automatically generated from the titles and abstracts of the papers on this site.