Local convexity of the TAP free energy and AMP convergence for
Z2-synchronization
- URL: http://arxiv.org/abs/2106.11428v3
- Date: Sat, 15 Apr 2023 18:06:10 GMT
- Title: Local convexity of the TAP free energy and AMP convergence for
Z2-synchronization
- Authors: Michael Celentano, Zhou Fan, Song Mei
- Abstract summary: We study mean-field variational Bayesian inference using the TAP approach for Z2-synchronization.
We show that for any signal strength $\lambda > 1$, there exists a unique local minimizer of the TAP free energy functional near the mean of the Bayes posterior law.
- Score: 17.940267599218082
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We study mean-field variational Bayesian inference using the TAP approach,
for Z2-synchronization as a prototypical example of a high-dimensional Bayesian
model. We show that for any signal strength $\lambda > 1$ (the weak-recovery
threshold), there exists a unique local minimizer of the TAP free energy
functional near the mean of the Bayes posterior law. Furthermore, the TAP free
energy in a local neighborhood of this minimizer is strongly convex.
Consequently, a natural-gradient/mirror-descent algorithm achieves linear
convergence to this minimizer from a local initialization, which may be
obtained by a constant number of iterates of Approximate Message Passing (AMP).
This provides a rigorous foundation for variational inference in high
dimensions via minimization of the TAP free energy.
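
To make the minimization concrete, here is a minimal numpy sketch (an illustration, not the paper's exact normalization) of a commonly stated form of the Z2-synchronization TAP free energy and of a mirror-descent loop under the binary-entropy mirror map; the step size `eta`, the iteration budget, and the constant in the Onsager-style correction term are assumptions.

```python
import numpy as np

def tap_free_energy(m, Y, lam):
    """Z2-sync TAP free energy in a commonly stated form, up to constants.
    h(m_i) is the negative entropy of a +/-1 variable with mean m_i."""
    n = len(m)
    q = m @ m / n
    h = 0.5 * ((1 + m) * np.log((1 + m) / 2) + (1 - m) * np.log((1 - m) / 2))
    return -0.5 * m @ Y @ m + h.sum() + 0.25 * n * lam**2 * (1 - q) ** 2

def tap_grad(m, Y, lam):
    # h'(m) = arctanh(m); the last term differentiates the Onsager correction.
    q = m @ m / len(m)
    return -Y @ m + np.arctanh(m) - lam**2 * (1 - q) * m

def mirror_descent(Y, lam, m0, eta=0.2, n_steps=200):
    """Mirror descent with the binary-entropy mirror map: the update acts on
    the dual variable arctanh(m), so iterates stay inside (-1, 1)."""
    m = np.clip(m0, -0.999, 0.999)
    for _ in range(n_steps):
        m = np.tanh(np.arctanh(m) - eta * tap_grad(m, Y, lam))
    return m
```

Per the abstract, `m0` would come from a constant number of AMP iterates; the local strong convexity near the minimizer is what upgrades this scheme to linear convergence.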
We also analyze the finite-sample convergence of AMP, showing that AMP is
asymptotically stable at the TAP minimizer for any $\lambda > 1$, and is
linearly convergent to this minimizer from a spectral initialization for
sufficiently large $\lambda$. Such a guarantee is stronger than results
obtainable by state evolution analyses, which only describe a fixed number of
AMP iterations in the infinite-sample limit.
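
The AMP iteration and spectral initialization referenced above can be sketched as follows; this is a schematic Bayes-AMP with the tanh denoiser and a top-eigenvector start, and the calibration details (scalings of the initialization and denoiser, the noise model) are assumptions rather than the paper's exact choices.

```python
import numpy as np

def amp_z2_sync(Y, n_iter=30):
    """Schematic Bayes-AMP for Z2-synchronization."""
    n = Y.shape[0]
    _, eigvecs = np.linalg.eigh(Y)
    x = np.sqrt(n) * eigvecs[:, -1]    # spectral initialization, ||x||^2 = n
    m_prev = np.zeros(n)
    for _ in range(n_iter):
        m = np.tanh(x)                 # posterior-mean denoiser for +/-1 signs
        b = np.mean(1.0 - m**2)        # Onsager coefficient: (1/n) sum tanh'(x_i)
        x = Y @ m - b * m_prev         # AMP step with memory correction
        m_prev = m
    return np.tanh(x)

# Toy run on Y = (lam/n) x x^T + W with GOE-like noise; sizes are illustrative.
rng = np.random.default_rng(0)
n, lam = 2000, 2.0
x_true = rng.choice([-1.0, 1.0], size=n)
G = rng.normal(size=(n, n)) / np.sqrt(n)
W = (G + G.T) / np.sqrt(2.0)
Y = (lam / n) * np.outer(x_true, x_true) + W
print(abs(amp_z2_sync(Y) @ x_true) / n)   # overlap with the signal, up to sign
```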
Our proofs combine the Kac-Rice formula and Sudakov-Fernique Gaussian
comparison inequality to analyze the complexity of critical points that satisfy
strong convexity and stability conditions within their local neighborhoods.
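
For reference, the two proof tools in their standard forms (paraphrased; regularity conditions omitted):

```latex
% Sudakov--Fernique: for centered Gaussian processes (X_t), (Y_t) on an index
% set T whose increments satisfy
%   E[(X_s - X_t)^2] <= E[(Y_s - Y_t)^2]  for all s, t in T,
% the expected suprema compare:
\mathbb{E}\Bigl[\sup_{t \in T} X_t\Bigr] \;\le\; \mathbb{E}\Bigl[\sup_{t \in T} Y_t\Bigr].

% Kac--Rice: the expected number of critical points of a smooth random field F
% in a region B that satisfy an event A (here, local strong convexity and
% stability conditions) is
\mathbb{E}\,\#\bigl\{m \in B : \nabla F(m) = 0,\ m \in A\bigr\}
  = \int_B \mathbb{E}\bigl[\lvert \det \nabla^2 F(m) \rvert\, \mathbf{1}_A \,\big|\, \nabla F(m) = 0 \bigr]\, p_{\nabla F(m)}(0)\, dm.
```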
Related papers
- Flow matching achieves almost minimax optimal convergence [50.38891696297888]
Flow matching (FM) has gained significant attention as a simulation-free generative model.
This paper discusses the convergence properties of FM for large sample size under the $p$-Wasserstein distance.
We establish that FM can achieve an almost minimax optimal convergence rate for $1 \leq p \leq 2$, presenting the first theoretical evidence that FM can reach convergence rates comparable to those of diffusion models.
arXiv Detail & Related papers (2024-05-31T14:54:51Z) - Non-asymptotic estimates for accelerated high order Langevin Monte Carlo algorithms [8.058385158111207]
We propose two new algorithms, namely aHOLA and aHOLLA, to sample from high-dimensional target distributions.
We establish non-asymptotic convergence rates for aHOLA in Wasserstein-1 and Wasserstein-2 distances.
arXiv Detail & Related papers (2024-05-09T11:12:03Z) - Mean-field variational inference with the TAP free energy: Geometric and
statistical properties in linear models [20.311583934266903]
We show that the landscape of the TAP free energy is strongly convex in an extensive neighborhood of a local minimizer.
We prove analogous properties for a local minimizer of the TAP free energy reachable by AMP, and show that posterior inference based on this minimizer remains correct.
arXiv Detail & Related papers (2023-11-14T17:35:01Z) - Sharper Analysis for Minibatch Stochastic Proximal Point Methods:
Stability, Smoothness, and Deviation [41.082982732100696]
We study a minibatch variant of stochastic proximal point (SPP) methods, namely M-SPP, for solving convex composite risk minimization problems.
We show that M-SPP with minibatch-size $n$ and iteration count $T$ enjoys an in-expectation fast rate of convergence.
In the small-$n$-large-$T$ setting, this result substantially improves the best known results of SPP-type approaches.
arXiv Detail & Related papers (2023-01-09T00:13:34Z) - Asymptotically Unbiased Instance-wise Regularized Partial AUC
Optimization: Theory and Algorithm [101.44676036551537]
One-way Partial AUC (OPAUC) and Two-way Partial AUC (TPAUC) measure the average performance of a binary classifier.
Most existing methods can only optimize the PAUC approximately, leading to uncontrollable bias.
We present a simpler reformulation of the PAUC problem via distributionally robust optimization (DRO).
arXiv Detail & Related papers (2022-10-08T08:26:22Z) - Sudakov-Fernique post-AMP, and a new proof of the local convexity of the
TAP free energy [0.6091702876917281]
We derive a Gaussian comparison inequality that holds after AMP iterations, yielding a new proof of the local convexity of the TAP free energy.
We also prove a conjecture of El Alaoui et al. (2022) on the local convexity of a related but distinct free energy.
arXiv Detail & Related papers (2022-08-19T21:21:06Z) - KL-Entropy-Regularized RL with a Generative Model is Minimax Optimal [70.15267479220691]
We consider and analyze the sample complexity of model-free reinforcement learning with a generative model.
Our analysis shows that it is nearly minimax-optimal for finding an $\varepsilon$-optimal policy when $\varepsilon$ is sufficiently small.
arXiv Detail & Related papers (2022-05-27T19:39:24Z) - Minibatch vs Local SGD with Shuffling: Tight Convergence Bounds and
Beyond [63.59034509960994]
We study shuffling-based variants: minibatch and local Random Reshuffling, which draw gradients without replacement.
For smooth functions satisfying the Polyak-Lojasiewicz condition, we obtain convergence bounds which show that these shuffling-based variants converge faster than their with-replacement counterparts.
We propose an algorithmic modification called synchronized shuffling that leads to convergence rates faster than our lower bounds in near-homogeneous settings.
arXiv Detail & Related papers (2021-10-20T02:25:25Z) - Optimal policy evaluation using kernel-based temporal difference methods [78.83926562536791]
We use reproducing kernel Hilbert spaces for estimating the value function of an infinite-horizon discounted Markov reward process.
We derive a non-asymptotic upper bound on the error with explicit dependence on the eigenvalues of the associated kernel operator.
We prove minimax lower bounds over sub-classes of MRPs.
arXiv Detail & Related papers (2021-09-24T14:48:20Z) - On the Convergence of Stochastic Extragradient for Bilinear Games with
Restarted Iteration Averaging [96.13485146617322]
We present an analysis of the Stochastic ExtraGradient (SEG) method with constant step size, and propose variations of the method that yield favorable convergence.
We prove that when augmented with averaging, SEG provably converges to the Nash equilibrium, and such a rate is provably accelerated by incorporating a scheduled restarting procedure.
arXiv Detail & Related papers (2021-06-30T17:51:36Z) - Convergence Rates of Stochastic Gradient Descent under Infinite Noise
Variance [14.06947898164194]
Heavy tails emerge in stochastic gradient descent (SGD) in various scenarios.
We provide convergence guarantees for SGD under a state-dependent and heavy-tailed noise with a potentially infinite variance.
Our results indicate that even under heavy-tailed noise with infinite variance, SGD can converge to the global optimum.
arXiv Detail & Related papers (2021-02-20T13:45:11Z)