The price of ignorance: how much does it cost to forget noise structure
in low-rank matrix estimation?
- URL: http://arxiv.org/abs/2205.10009v1
- Date: Fri, 20 May 2022 07:54:21 GMT
- Title: The price of ignorance: how much does it cost to forget noise structure
in low-rank matrix estimation?
- Authors: Jean Barbier, TianQi Hou, Marco Mondelli and Manuel Sáenz
- Abstract summary: We consider the problem of estimating a rank-1 signal corrupted by structured rotationally invariant noise.
We take a step towards understanding the effect of a strong source of mismatch: the noise statistics.
We show that this performance gap is due to an incorrect estimation of the signal norm.
- Score: 21.3083877172595
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of estimating a rank-1 signal corrupted by structured
rotationally invariant noise, and address the following question: how well do
inference algorithms perform when the noise statistics are unknown and
Gaussian noise is assumed instead? While the matched Bayes-optimal setting with
unstructured noise is well understood, the analysis of this mismatched problem
is still in its early stages. In this paper, we take a step towards understanding
the effect of a strong source of mismatch: the noise statistics. Our
main technical contribution is the rigorous analysis of a Bayes estimator and
of an approximate message passing (AMP) algorithm, both of which incorrectly
assume a Gaussian setup. The first result exploits the theory of spherical
integrals and of low-rank matrix perturbations; the idea behind the second one
is to design and analyze an artificial AMP which, by taking advantage of the
flexibility in the denoisers, is able to "correct" the mismatch. Armed with
these sharp asymptotic characterizations, we unveil a rich and often unexpected
phenomenology. For example, although AMP is in principle designed to efficiently
compute the Bayes estimator, the former is outperformed by the latter in terms
of mean-square error. We show that this performance gap is due to an incorrect
estimation of the signal norm. In fact, when the SNR is large enough, the
overlaps of the AMP and the Bayes estimator coincide, and they even match those
of optimal estimators taking into account the structure of the noise.
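To make the setup concrete, the sketch below simulates this mismatched scenario under illustrative assumptions that are not taken from the paper: a Rademacher (±1) prior on the signal, a uniform noise spectrum, a known SNR, and a weakly informative initialization. The data are generated with rotationally invariant noise, while the AMP denoiser and Onsager correction are those appropriate for Gaussian (Wigner) noise; this is a minimal sketch of the kind of Gaussian-assumed AMP whose asymptotics the paper characterizes, not the authors' implementation.

```python
# Hypothetical illustration (not the authors' code): a rank-1 spike in
# rotationally invariant noise, estimated by an AMP iteration whose denoiser
# and Onsager correction assume Gaussian (Wigner) noise -- the mismatched
# algorithm analyzed in the paper. Prior, spectrum, SNR and initialization
# below are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, snr, n_iter = 1500, 2.5, 12

# Rank-1 signal with i.i.d. +/-1 entries (Rademacher prior).
x = rng.choice([-1.0, 1.0], size=n)

# Structured rotationally invariant noise W = O diag(d) O^T: Haar orthogonal O
# and a unit-variance, non-semicircular spectrum (uniform on [-sqrt(3), sqrt(3)]).
Q, R = np.linalg.qr(rng.standard_normal((n, n)))
O = Q * np.sign(np.diag(R))
d = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), size=n)
W = (O * d) @ O.T

# Observation: Y = (snr / n) x x^T + W.
Y = (snr / n) * np.outer(x, x) + W

# Gaussian-assumed AMP: tanh denoiser and Onsager term that would be
# Bayes-optimal if W were a Wigner matrix with matching variance.
xhat = 0.1 * x + rng.standard_normal(n)   # weakly informative initialization
xhat_prev = np.zeros(n)
onsager = 0.0
for t in range(n_iter):
    v = Y @ xhat - onsager * xhat_prev          # AMP iterate with memory term
    xhat_prev, xhat = xhat, np.tanh(snr * v)    # posterior mean under Gaussian noise
    onsager = snr * np.mean(1.0 - xhat ** 2)    # Onsager correction for Gaussian noise
    print(f"iteration {t:2d}: overlap with the signal = {abs(xhat @ x) / n:.3f}")
```

The printed overlap tracks how well the mismatched iteration aligns with the planted signal; replacing the uniform spectrum with a semicircle one essentially recovers the matched Gaussian setting.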
Related papers
- Unrolled denoising networks provably learn optimal Bayesian inference [54.79172096306631]
We prove the first rigorous learning guarantees for neural networks based on unrolling approximate message passing (AMP)
For compressed sensing, we prove that when trained on data drawn from a product prior, the layers of the network converge to the same denoisers used in Bayes AMP.
arXiv Detail & Related papers (2024-09-19T17:56:16Z) - Information limits and Thouless-Anderson-Palmer equations for spiked matrix models with structured noise [19.496063739638924]
We consider a problem of Bayesian inference for a structured spiked model.
We show how to predict the statistical limits using an efficient algorithm inspired by the theory of adaptive Thouless-Anderson-Palmer equations.
arXiv Detail & Related papers (2024-05-31T16:38:35Z) - Mismatched estimation of non-symmetric rank-one matrices corrupted by
structured noise [21.447998004700384]
We study the performance of a Bayesian statistician who estimates a rank-one signal corrupted by non-symmetric rotationally invariant noise with a generic distribution of singular values.
We derive the exact analytic expression for the error of the mismatched Bayes estimator and also provide the analysis of an approximate message passing (AMP) algorithm.
arXiv Detail & Related papers (2023-02-07T07:56:19Z) - Optimizing the Noise in Self-Supervised Learning: from Importance
Sampling to Noise-Contrastive Estimation [80.07065346699005]
It is widely assumed that the optimal noise distribution should be made equal to the data distribution, as in Generative Adversarial Networks (GANs).
We turn to Noise-Contrastive Estimation which grounds this self-supervised task as an estimation problem of an energy-based model of the data.
We soberly conclude that the optimal noise may be hard to sample from, and the gain in efficiency can be modest compared to choosing the noise distribution equal to the data's.
arXiv Detail & Related papers (2023-01-23T19:57:58Z) - Robust Matrix Completion with Heavy-tailed Noise [0.5837881923712392]
This paper studies low-rank matrix completion in the presence of heavy-tailed, possibly asymmetric noise.
To accommodate heavy-tailed noise, we adopt an adaptive Huber loss, which is robust against large and possibly asymmetric errors.
We prove that under merely a second moment condition on the error, the Euclidean error falls geometrically fast until achieving a minimax-optimal statistical estimation error.
arXiv Detail & Related papers (2022-06-09T04:48:48Z) - High-Order Qubit Dephasing at Sweet Spots by Non-Gaussian Fluctuators:
Symmetry Breaking and Floquet Protection [55.41644538483948]
We study the qubit dephasing caused by the non-Gaussian fluctuators.
We predict a symmetry-breaking effect that is unique to the non-Gaussian noise.
arXiv Detail & Related papers (2022-06-06T18:02:38Z) - The Optimal Noise in Noise-Contrastive Learning Is Not What You Think [80.07065346699005]
We show that deviating from this assumption can actually lead to better statistical estimators.
In particular, the optimal noise distribution is different from the data's and even from a different family.
arXiv Detail & Related papers (2022-03-02T13:59:20Z) - Shape Matters: Understanding the Implicit Bias of the Noise Covariance [76.54300276636982]
Noise in gradient descent provides a crucial implicit regularization effect for training overparameterized models.
We show that parameter-dependent noise -- induced by mini-batches or label perturbation -- is far more effective than Gaussian noise.
Our analysis reveals that parameter-dependent noise introduces a bias towards local minima with smaller noise variance, whereas spherical Gaussian noise does not.
arXiv Detail & Related papers (2020-06-15T18:31:02Z) - Contextual Linear Bandits under Noisy Features: Towards Bayesian Oracles [65.9694455739978]
We study contextual linear bandit problems under feature uncertainty, where the features are noisy and have missing entries.
Our analysis reveals that the optimal hypothesis can significantly deviate from the underlying realizability function, depending on the noise characteristics.
This implies that classical approaches cannot guarantee a non-trivial regret bound.
arXiv Detail & Related papers (2017-03-03T21:39:56Z)