Information limits and Thouless-Anderson-Palmer equations for spiked matrix models with structured noise
- URL: http://arxiv.org/abs/2405.20993v2
- Date: Mon, 8 Jul 2024 16:26:03 GMT
- Title: Information limits and Thouless-Anderson-Palmer equations for spiked matrix models with structured noise
- Authors: Jean Barbier, Francesco Camilli, Marco Mondelli, Yizhou Xu,
- Abstract summary: We consider a saturate problem of Bayesian inference for a structured spiked model.
We show how to predict the statistical limits using an efficient algorithm inspired by the theory of adaptive Thouless-Anderson-Palmer equations.
- Score: 19.496063739638924
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider a prototypical problem of Bayesian inference for a structured spiked model: a low-rank signal is corrupted by additive noise. While both information-theoretic and algorithmic limits are well understood when the noise is a Gaussian Wigner matrix, the more realistic case of structured noise still proves to be challenging. To capture the structure while maintaining mathematical tractability, a line of work has focused on rotationally invariant noise. However, existing studies either provide sub-optimal algorithms or are limited to special cases of noise ensembles. In this paper, using tools from statistical physics (replica method) and random matrix theory (generalized spherical integrals) we establish the first characterization of the information-theoretic limits for a noise matrix drawn from a general trace ensemble. Remarkably, our analysis unveils the asymptotic equivalence between the rotationally invariant model and a surrogate Gaussian one. Finally, we show how to saturate the predicted statistical limits using an efficient algorithm inspired by the theory of adaptive Thouless-Anderson-Palmer (TAP) equations.
Related papers
- Accelerated zero-order SGD under high-order smoothness and overparameterized regime [79.85163929026146]
We present a novel gradient-free algorithm to solve convex optimization problems.
Such problems are encountered in medicine, physics, and machine learning.
We provide convergence guarantees for the proposed algorithm under both types of noise.
arXiv Detail & Related papers (2024-11-21T10:26:17Z) - Robust Learning under Hybrid Noise [24.36707245704713]
We propose a novel unified learning framework called "Feature and Label Recovery" (FLR) to combat the hybrid noise from the perspective of data recovery.
arXiv Detail & Related papers (2024-07-04T16:13:25Z) - Bayesian Inference of General Noise Model Parameters from Surface Code's Syndrome Statistics [0.0]
We propose general noise model Bayesian inference methods that integrate the surface code's tensor network simulator.
For stationary noise, where the noise parameters are constant and do not change, we propose a method based on the Markov chain Monte Carlo.
For time-varying noise, which is a more realistic situation, we introduce another method based on the sequential Monte Carlo.
arXiv Detail & Related papers (2024-06-13T10:26:04Z) - Matrix Denoising with Doubly Heteroscedastic Noise: Fundamental Limits and Optimal Spectral Methods [24.06775799553418]
We study the matrix denoising problem of estimating the singular vectors of a rank-$1$ signal corrupted by noise with both column and row correlations.
Our work establishes the information-theoretic and algorithmic limits of matrix denoising with doubly heteroscedastic noise.
arXiv Detail & Related papers (2024-05-22T18:38:10Z) - Optimal Algorithms for the Inhomogeneous Spiked Wigner Model [89.1371983413931]
We derive an approximate message-passing algorithm (AMP) for the inhomogeneous problem.
We identify in particular the existence of a statistical-to-computational gap where known algorithms require a signal-to-noise ratio bigger than the information-theoretic threshold to perform better than random.
arXiv Detail & Related papers (2023-02-13T19:57:17Z) - Bayes-optimal limits in structured PCA, and how to reach them [21.3083877172595]
We study the paradigmatic matrix model of principal components analysis (PCA), where a rank-one matrix is corrupted by additive noise.
We provide the first characterization of the Bayes-optimal limits of inference in this model.
We propose a novel approximate message passing algorithm (AMP), inspired by the theory of Adaptive Thouless-Anderson-Palmer equations.
arXiv Detail & Related papers (2022-10-03T21:31:41Z) - The Optimal Noise in Noise-Contrastive Learning Is Not What You Think [80.07065346699005]
We show that deviating from this assumption can actually lead to better statistical estimators.
In particular, the optimal noise distribution is different from the data's and even from a different family.
arXiv Detail & Related papers (2022-03-02T13:59:20Z) - Optimizing Information-theoretical Generalization Bounds via Anisotropic
Noise in SGLD [73.55632827932101]
We optimize the information-theoretical generalization bound by manipulating the noise structure in SGLD.
We prove that with constraint to guarantee low empirical risk, the optimal noise covariance is the square root of the expected gradient covariance.
arXiv Detail & Related papers (2021-10-26T15:02:27Z) - Fitting quantum noise models to tomography data [0.0]
We develop algorithms to analyse and evaluate unknown noise processes.
In the case of dynamics consistent with Markovian evolution, our algorithm outputs the best-fit Lindbladian.
In the case of non-Markovian dynamics, our algorithm returns a quantitative and operationally meaningful measure of non-Markovianity.
arXiv Detail & Related papers (2021-03-31T17:44:50Z) - Shape Matters: Understanding the Implicit Bias of the Noise Covariance [76.54300276636982]
Noise in gradient descent provides a crucial implicit regularization effect for training over parameterized models.
We show that parameter-dependent noise -- induced by mini-batches or label perturbation -- is far more effective than Gaussian noise.
Our analysis reveals that parameter-dependent noise introduces a bias towards local minima with smaller noise variance, whereas spherical Gaussian noise does not.
arXiv Detail & Related papers (2020-06-15T18:31:02Z) - Multiplicative noise and heavy tails in stochastic optimization [62.993432503309485]
empirical optimization is central to modern machine learning, but its role in its success is still unclear.
We show that it commonly arises in parameters of discrete multiplicative noise due to variance.
A detailed analysis is conducted in which we describe on key factors, including recent step size, and data, all exhibit similar results on state-of-the-art neural network models.
arXiv Detail & Related papers (2020-06-11T09:58:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.