Estimating Rank-One Spikes from Heavy-Tailed Noise via Self-Avoiding
Walks
- URL: http://arxiv.org/abs/2008.13735v1
- Date: Mon, 31 Aug 2020 16:57:20 GMT
- Title: Estimating Rank-One Spikes from Heavy-Tailed Noise via Self-Avoiding
Walks
- Authors: Jingqiu Ding, Samuel B. Hopkins, David Steurer
- Abstract summary: We study symmetric spiked matrix models with respect to a general class of noise distributions.
We exhibit an estimator that works for heavy-tailed noise up to the BBP threshold, which is optimal even for Gaussian noise.
Our estimator can be evaluated in polynomial time by counting self-avoiding walks via a color-coding technique.
- Score: 13.879536370173506
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study symmetric spiked matrix models with respect to a general class of
noise distributions. Given a rank-1 deformation of a random noise matrix, whose
entries are independently distributed with zero mean and unit variance, the
goal is to estimate the rank-1 part. For the case of Gaussian noise, the top
eigenvector of the given matrix is a widely-studied estimator known to achieve
optimal statistical guarantees, e.g., in the sense of the celebrated BBP phase
transition. However, this estimator can fail completely for heavy-tailed noise.
In this work, we exhibit an estimator that works for heavy-tailed noise up to
the BBP threshold, which is optimal even for Gaussian noise. We give a
non-asymptotic analysis of our estimator which relies only on the variance of
each entry remaining constant as the size of the matrix grows: higher moments
may grow arbitrarily fast or even fail to exist. Previously, it was only known
how to achieve these guarantees if higher-order moments of the noises are
bounded by a constant independent of the size of the matrix. Our estimator can
be evaluated in polynomial time by counting self-avoiding walks via a
color-coding technique. Moreover, we extend our estimator to spiked tensor models
and establish analogous results.
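To make the model and the counting idea concrete, here is a minimal, illustrative sketch (not the authors' implementation): it draws a rank-one spike plus symmetric heavy-tailed noise, approximates a self-avoiding-walk matrix with the color-coding trick (random vertex colorings plus a dynamic program over color subsets), and compares the top eigenvectors of the raw matrix and of the walk matrix. The normalization lam * x x^T with a unit vector x, the walk length, and the trial count are assumptions chosen for readability, not the settings needed for the paper's guarantees, so at this toy scale the comparison is only indicative.

```python
import numpy as np
from math import factorial

rng = np.random.default_rng(0)

def spiked_matrix(n, lam, df=2.5):
    """Y = lam * x x^T + W / sqrt(n): a rank-one spike plus a symmetric matrix of
    iid heavy-tailed (Student-t) entries rescaled to zero mean and unit variance."""
    x = rng.standard_normal(n)
    x /= np.linalg.norm(x)
    W = rng.standard_t(df, size=(n, n)) / np.sqrt(df / (df - 2))
    W = np.triu(W, 1)
    W = W + W.T                               # symmetric, zero diagonal
    return lam * np.outer(x, x) + W / np.sqrt(n), x

def saw_matrix(Y, length=4, trials=10):
    """Approximate Q[i, j] = sum over self-avoiding walks from i to j with `length`
    edges of the product of traversed Y-entries, via color coding: color vertices at
    random, count 'colorful' walks with a dynamic program over color subsets, and
    rescale by the probability that a fixed self-avoiding walk is colorful."""
    n = Y.shape[0]
    k = length + 1                            # vertices on a walk = number of colors
    Q = np.zeros((n, n))
    for _ in range(trials):
        colors = rng.integers(k, size=n)
        masks = [(colors == c).astype(float) for c in range(k)]
        # dp[S][i, v] = weighted count of colorful walks from i to v whose
        # vertex colors are exactly the set S
        dp = {frozenset([c]): np.diag(masks[c]) for c in range(k)}
        for size in range(1, k):
            for S in [S for S in dp if len(S) == size]:
                for c in range(k):
                    if c in S:
                        continue
                    step = dp[S] @ (Y * masks[c])   # extend each walk by one edge into color c
                    key = S | {c}
                    dp[key] = dp.get(key, 0.0) + step
        Q += dp[frozenset(range(k))]
    return Q / (trials * factorial(k) / k**k)        # undo the colorful-walk bias

n, lam = 120, 2.0
Y, x = spiked_matrix(n, lam)
v_pca = np.linalg.eigh(Y)[1][:, -1]                  # top eigenvector of Y itself
v_saw = np.linalg.eigh(saw_matrix(Y))[1][:, -1]      # top eigenvector of the SAW matrix
print("overlap with the spike -- PCA:", abs(v_pca @ x), " SAW:", abs(v_saw @ x))
```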
Related papers
- Relaxed Quantile Regression: Prediction Intervals for Asymmetric Noise [51.87307904567702]
Quantile regression is a leading approach for obtaining such intervals via the empirical estimation of quantiles in the distribution of outputs.
We propose Relaxed Quantile Regression (RQR), a direct alternative to quantile regression based interval construction that removes this arbitrary constraint.
We demonstrate that this added flexibility results in intervals with an improvement in desirable qualities.
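As a reminder of the baseline that RQR relaxes (independently fitted lower and upper quantiles), here is a hedged numpy sketch of interval construction via standard quantile regression with the pinball loss; the linear model, subgradient optimizer, and noise distribution are illustrative assumptions, not the paper's method.

```python
import numpy as np

def fit_linear_quantile(X, y, tau, lr=0.5, steps=3000):
    """Fit w so that X @ w tracks the tau-quantile of y given X, by subgradient
    descent on the pinball (quantile) loss rho_tau(r) = r * (tau - 1[r < 0])."""
    w = np.zeros(X.shape[1])
    for t in range(steps):
        r = y - X @ w
        subgrad = -X.T @ (tau - (r < 0)) / len(y)
        w -= lr / np.sqrt(t + 1) * subgrad
    return w

rng = np.random.default_rng(0)
n = 500
X = np.c_[np.ones(n), rng.uniform(-1, 1, n)]
y = 2 * X[:, 1] + rng.standard_exponential(n)        # asymmetric (exponential) noise
lo, hi = fit_linear_quantile(X, y, 0.05), fit_linear_quantile(X, y, 0.95)
coverage = np.mean((y >= X @ lo) & (y <= X @ hi))
print("empirical coverage of the [0.05, 0.95] interval:", coverage)
```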
arXiv Detail & Related papers (2024-06-05T13:36:38Z) - General Gaussian Noise Mechanisms and Their Optimality for Unbiased Mean
Estimation [58.03500081540042]
A classical approach to private mean estimation is to compute the true mean and add unbiased, but possibly correlated, Gaussian noise to it.
We show that for every input dataset, an unbiased mean estimator satisfying concentrated differential privacy introduces approximately at least as much error as such a Gaussian mechanism.
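For context on the baseline mechanism this entry refers to, here is a hedged sketch of unbiased private mean estimation with plain (uncorrelated) Gaussian noise under zero-concentrated differential privacy; the calibration sigma = sensitivity / sqrt(2 * rho) is the standard zCDP Gaussian mechanism, not the paper's more general correlated construction, and the data bounds lo, hi are assumed known (values outside them would make clipping introduce bias).

```python
import numpy as np

def private_mean_gaussian(x, lo, hi, rho, rng=np.random.default_rng()):
    """Unbiased mean of values in [lo, hi] under rho-zCDP via the Gaussian mechanism.
    Changing one of the n values moves the mean by at most (hi - lo) / n."""
    x = np.clip(x, lo, hi)                      # no-op if the data really lie in [lo, hi]
    sensitivity = (hi - lo) / len(x)
    sigma = sensitivity / np.sqrt(2 * rho)      # standard zCDP calibration
    return x.mean() + rng.normal(0.0, sigma)

data = np.random.default_rng(1).uniform(0, 1, size=1000)
print(private_mean_gaussian(data, 0.0, 1.0, rho=0.5), "vs true mean", data.mean())
```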
arXiv Detail & Related papers (2023-01-31T18:47:42Z) - Optimizing the Noise in Self-Supervised Learning: from Importance
Sampling to Noise-Contrastive Estimation [80.07065346699005]
It is widely assumed that the optimal noise distribution should be made equal to the data distribution, as in Generative Adversarial Networks (GANs).
We turn to Noise-Contrastive Estimation which grounds this self-supervised task as an estimation problem of an energy-based model of the data.
We soberly conclude that the optimal noise may be hard to sample from, and the gain in efficiency can be modest compared to choosing the noise distribution equal to the data's.
arXiv Detail & Related papers (2023-01-23T19:57:58Z) - Detection problems in the spiked matrix models [15.125686694430573]
We first show that the principal component analysis can be improved by entrywise pre-transforming the data matrix if the noise is non-Gaussian.
We also introduce an algorithm that estimates the rank of the signal when it is not known a priori.
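A hedged illustration of the pre-transformation idea mentioned above: apply a bounded entrywise function to the data matrix before taking the top eigenvector. The simple clipping used here is a stand-in for the optimal, noise-density-dependent transform derived in that paper, and the threshold, dimension, and noise law are assumptions; exact numbers vary with the seed.

```python
import numpy as np

rng = np.random.default_rng(0)
n, lam, df = 300, 3.0, 3.0
x = rng.standard_normal(n); x /= np.linalg.norm(x)
W = rng.standard_t(df, size=(n, n)) / np.sqrt(df / (df - 2))   # unit-variance, heavy-tailed
W = np.triu(W, 1); W = W + W.T
Y = lam * np.outer(x, x) + W / np.sqrt(n)

def top_eigvec(M):
    return np.linalg.eigh(M)[1][:, -1]

# Entrywise pre-transform before PCA: clip large entries to tame the heavy tails.
threshold = 3.0 / np.sqrt(n)
Y_pre = np.clip(Y, -threshold, threshold)

print("plain PCA overlap:          ", abs(top_eigvec(Y) @ x))
print("pre-transformed PCA overlap:", abs(top_eigvec(Y_pre) @ x))
```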
arXiv Detail & Related papers (2023-01-12T23:46:41Z) - Robust Inference of Manifold Density and Geometry by Doubly Stochastic
Scaling [8.271859911016719]
We develop tools for robust inference under high-dimensional noise.
We show that our approach is robust to variability in technical noise levels across cell types.
arXiv Detail & Related papers (2022-09-16T15:39:11Z) - Robust Matrix Completion with Heavy-tailed Noise [0.5837881923712392]
This paper studies low-rank matrix completion in the presence of heavy-tailed possibly asymmetric noise.
In this paper, we adopt an adaptive Huber loss to accommodate heavy-tailed noise, which is robust against large and possibly asymmetric errors.
We prove that under merely a second moment condition on the error, the Euclidean error falls geometrically fast until achieving a minimax-optimal statistical estimation error.
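To make that loss concrete, here is a small hedged sketch of the Huber loss and its clipped gradient, which is roughly what replaces the least-squares gradient on observed entries in Huber-based matrix completion solvers; the truncation level tau is fixed for illustration, whereas the paper chooses it adaptively.

```python
import numpy as np

def huber_loss(r, tau):
    """Huber loss: quadratic for |r| <= tau, linear (slope tau) beyond."""
    small = np.abs(r) <= tau
    return np.where(small, 0.5 * r**2, tau * (np.abs(r) - 0.5 * tau))

def huber_grad(r, tau):
    """Derivative w.r.t. the residual: the least-squares gradient r, clipped to [-tau, tau]."""
    return np.clip(r, -tau, tau)

residuals = np.array([0.3, -1.2, 50.0])    # the last residual is a heavy-tailed outlier
tau = 2.0                                  # fixed here; chosen adaptively (growing with n) in the paper
print(huber_grad(residuals, tau))          # -> [ 0.3 -1.2  2. ]: the outlier's influence is capped
```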
arXiv Detail & Related papers (2022-06-09T04:48:48Z) - The Optimal Noise in Noise-Contrastive Learning Is Not What You Think [80.07065346699005]
We show that deviating from this assumption can actually lead to better statistical estimators.
In particular, the optimal noise distribution is different from the data's and even from a different family.
arXiv Detail & Related papers (2022-03-02T13:59:20Z) - Optimal policy evaluation using kernel-based temporal difference methods [78.83926562536791]
We use reproducing kernel Hilbert spaces for estimating the value function of an infinite-horizon discounted Markov reward process.
We derive a non-asymptotic upper bound on the error with explicit dependence on the eigenvalues of the associated kernel operator.
We prove minimax lower bounds over sub-classes of MRPs.
arXiv Detail & Related papers (2021-09-24T14:48:20Z) - A Random Matrix Perspective on Random Tensors [40.89521598604993]
We study the spectra of random matrices arising from contractions of a given random tensor.
Our technique yields a hitherto unknown characterization of the local maximum of the ML problem.
Our approach is versatile and can be extended to other models, such as asymmetric, non-Gaussian and higher-order ones.
arXiv Detail & Related papers (2021-08-02T10:42:22Z) - On Estimating Rank-One Spiked Tensors in the Presence of Heavy Tailed
Errors [10.166435679260015]
We study the estimation of a rank-one spiked tensor in the presence of heavy tailed noise.
Our results highlight some of the fundamental similarities and differences in the tradeoff between statistical and computational efficiencies.
arXiv Detail & Related papers (2021-07-20T17:45:55Z) - Learning Noise Transition Matrix from Only Noisy Labels via Total
Variation Regularization [88.91872713134342]
We propose a theoretically grounded method that can estimate the noise transition matrix and learn a classifier simultaneously.
We show the effectiveness of the proposed method through experiments on benchmark and real-world datasets.
arXiv Detail & Related papers (2021-02-04T05:09:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.