Exact Fractional Inference via Re-Parametrization & Interpolation between Tree-Re-Weighted- and Belief Propagation- Algorithms
- URL: http://arxiv.org/abs/2301.10369v4
- Date: Wed, 13 Nov 2024 10:35:25 GMT
- Title: Exact Fractional Inference via Re-Parametrization & Interpolation between Tree-Re-Weighted- and Belief Propagation- Algorithms
- Authors: Hamidreza Behjoo, Michael Chertkov
- Abstract summary: We show how to express $Z$ as a product, $\forall \lambda:\ Z = Z^{(\lambda)}\tilde{Z}^{(\lambda)}$, where the multiplicative correction, $\tilde{Z}^{(\lambda)}$, is an expectation over a node-independent probability distribution.
We also discuss the applicability of this approach to the problem of image de-noising.
- Abstract: Computing the partition function, $Z$, of an Ising model over a graph of $N$ \enquote{spins} is most likely exponential in $N$. Efficient variational methods, such as Belief Propagation (BP) and Tree Re-Weighted (TRW) algorithms, compute $Z$ approximately by minimizing the respective (BP- or TRW-) free energy. We generalize the variational scheme by building a $\lambda$-fractional interpolation, $Z^{(\lambda)}$, where $\lambda=0$ and $\lambda=1$ correspond to TRW- and BP-approximations, respectively. This fractional scheme -- coined Fractional Belief Propagation (FBP) -- guarantees that in the attractive (ferromagnetic) case $Z^{(TRW)} \geq Z^{(\lambda)} \geq Z^{(BP)}$, and there exists a unique (\enquote{exact}) $\lambda_*$ such that $Z=Z^{(\lambda_*)}$. Generalizing the re-parametrization approach of \citep{wainwright_tree-based_2002} and the loop series approach of \citep{chertkov_loop_2006}, we show how to express $Z$ as a product, $\forall \lambda:\ Z=Z^{(\lambda)}{\tilde Z}^{(\lambda)}$, where the multiplicative correction, ${\tilde Z}^{(\lambda)}$, is an expectation over a node-independent probability distribution built from node-wise fractional marginals. Our theoretical analysis is complemented by extensive experiments with models from Ising ensembles over planar and random graphs of medium and large sizes. Our empirical study yields a number of interesting observations, such as the ability to estimate ${\tilde Z}^{(\lambda)}$ with $O(N^{2::4})$ fractional samples and suppression of variation in $\lambda_*$ estimates with an increase in $N$ for instances from a particular random Ising ensemble, where $[2::4]$ indicates a range from $2$ to $4$. We also discuss the applicability of this approach to the problem of image de-noising.
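The bisection step implied by the abstract's monotonicity guarantee ($Z^{(TRW)} \geq Z^{(\lambda)} \geq Z^{(BP)}$ in the attractive case, with a unique $\lambda_*$ such that $Z = Z^{(\lambda_*)}$) is simple to illustrate. Below is a minimal Python sketch, not the authors' implementation: the exact $\log Z$ is computed by brute force on a tiny ferromagnetic Ising model, and the fractional family $\log Z^{(\lambda)}$ is replaced by a hypothetical stand-in (a linear interpolation in $\log Z$ between placeholder TRW-like upper and BP-like lower bounds); in the actual method, $Z^{(\lambda)}$ would come from minimizing the $\lambda$-fractional free energy.

```python
# Illustrative sketch only (assumptions noted in comments); not the paper's code.
import itertools
import math


def exact_log_z(J, h):
    """Brute-force log partition function of an Ising model.

    J: dict mapping edge (i, j) -> coupling; h: list of N node fields.
    Exponential in N, so only viable for tiny graphs.
    """
    N = len(h)
    energies = []
    for spins in itertools.product([-1, 1], repeat=N):
        e = sum(Jij * spins[i] * spins[j] for (i, j), Jij in J.items())
        e += sum(hi * s for hi, s in zip(h, spins))
        energies.append(e)
    m = max(energies)  # log-sum-exp for numerical stability
    return m + math.log(sum(math.exp(e - m) for e in energies))


def find_lambda_star(log_z_lambda, log_z_exact, tol=1e-10):
    """Bisection for lambda_* with Z^(lambda_*) = Z, assuming log Z^(lambda)
    decreases monotonically from TRW (lambda=0) to BP (lambda=1)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if log_z_lambda(mid) > log_z_exact:
            lo = mid  # still above the exact value: move toward BP
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # 4-spin ferromagnetic cycle with weak fields (attractive case).
    J = {(0, 1): 0.5, (1, 2): 0.5, (2, 3): 0.5, (3, 0): 0.5}
    h = [0.1, -0.1, 0.1, -0.1]
    log_z = exact_log_z(J, h)

    # Hypothetical stand-in for the fractional family: linear interpolation
    # in log Z between placeholder TRW-like (upper) and BP-like (lower) bounds.
    log_z_trw, log_z_bp = log_z + 0.3, log_z - 0.2  # stand-in bounds only
    log_z_lam = lambda lam: (1 - lam) * log_z_trw + lam * log_z_bp

    lam_star = find_lambda_star(log_z_lam, log_z)
    print(f"log Z = {log_z:.6f}, lambda_* = {lam_star:.6f}")
```

Since $\log Z^{(\lambda)}$ decreases monotonically from the TRW value at $\lambda = 0$ to the BP value at $\lambda = 1$, bisection brackets $\lambda_*$ at one bit of precision per iteration; in this stand-in the true $\lambda_*$ is $0.3/0.5 = 0.6$.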
Related papers
- On the $O(\frac{\sqrt{d}}{T^{1/4}})$ Convergence Rate of RMSProp and Its Momentum Extension Measured by $\ell_1$ Norm [59.65871549878937]
This paper considers RMSProp and its momentum extension and establishes a convergence rate of $O\big(\frac{\sqrt{d}}{T^{1/4}}\big)$ for the averaged gradient norm $\frac{1}{T}\sum_{k=1}^{T}\|\nabla f(x_k)\|_1$.
Our convergence rate matches the lower bound with respect to all the coefficients except the dimension $d$.
Our convergence rate can be considered analogous to the classical $\frac{1}{T}\sum_{k=1}^{T}\|\nabla f(x_k)\|_2^2 \le O(1/\sqrt{T})$ rate of SGD, once the change from the $\ell_2$ to the $\ell_1$ norm is accounted for.
arXiv Detail & Related papers (2024-02-01T07:21:32Z) - A Unified Framework for Uniform Signal Recovery in Nonlinear Generative Compressed Sensing [68.80803866919123]
Under nonlinear measurements, most prior results are non-uniform, i.e., they hold with high probability for a fixed $\mathbf{x}^*$ rather than for all $\mathbf{x}^*$ simultaneously.
Our framework accommodates GCS with 1-bit/uniformly quantized observations and single index models as canonical examples.
We also develop a concentration inequality that produces tighter bounds for product processes whose index sets have low metric entropy.
arXiv Detail & Related papers (2023-09-25T17:54:19Z) - Learning a Single Neuron with Adversarial Label Noise via Gradient Descent [50.659479930171585]
We study a function of the form $\mathbf{x} \mapsto \sigma(\mathbf{w} \cdot \mathbf{x})$ for monotone activations $\sigma$.
The goal of the learner is to output a hypothesis vector $\mathbf{w}$ such that $F(\mathbf{w}) = C\,\epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z) - Quantum-information theory of a Dirichlet ring with Aharonov-Bohm field [0.0]
Shannon information entropies $S_{\rho,\gamma}$, Fisher informations $I_{\rho,\gamma}$, Onicescu energies $O_{\rho,\gamma}$, and Rényi entropies $R_{\rho,\gamma}(\alpha)$ are calculated.
arXiv Detail & Related papers (2022-02-09T19:26:56Z) - Sparse sketches with small inversion bias [79.77110958547695]
Inversion bias arises when averaging estimates of quantities that depend on the inverse covariance.
We develop a framework for analyzing inversion bias, based on our proposed concept of an $(\epsilon,\delta)$-unbiased estimator for random matrices.
We show that when the sketching matrix $S$ is dense and has i.i.d. sub-gaussian entries, the estimator is $(\epsilon,\delta)$-unbiased for $(A^\top A)^{-1}$ with a sketch of size $m = O(d + \sqrt{d}/\epsilon)$.
arXiv Detail & Related papers (2020-11-21T01:33:15Z) - $k$-Forrelation Optimally Separates Quantum and Classical Query Complexity [3.4984289152418753]
We show that any partial function on $N$ bits can be computed with an advantage $\delta$ over a random guess by making $q$ quantum queries.
We also conjecture that the $k$-Forrelation problem -- a partial function that can be computed with $q = \lceil k/2 \rceil$ quantum queries -- achieves this optimal separation.
arXiv Detail & Related papers (2020-08-16T21:26:46Z) - Variational Orthogonal Features [29.636483122130027]
We show that for certain priors, features can be defined that remove the $\mathcal{O}(M^3)$ cost of computing a minibatch estimate of an evidence lower bound (ELBO).
We present a construction of features for any stationary prior kernel that allows for the computation of an unbiased estimator of the ELBO using $T$ Monte Carlo samples in $\mathcal{O}(\tilde{N}T + M^2T)$, and in $\mathcal{O}(\tilde{N}T + MT)$ with an additional approximation.
arXiv Detail & Related papers (2020-06-23T17:18:07Z) - Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample Complexity [59.34067736545355]
Given an MDP with $S$ states, $A$ actions, a discount factor $\gamma \in (0,1)$, and an approximation threshold $\epsilon > 0$, we provide a model-free algorithm to learn an $\epsilon$-optimal policy.
For small enough $\epsilon$, we show an algorithm with improved sample complexity.
arXiv Detail & Related papers (2020-06-06T13:34:41Z) - On the Complexity of Minimizing Convex Finite Sums Without Using the Indices of the Individual Functions [62.01594253618911]
We exploit the finite noise structure of finite sums to derive a matching $O(n^2)$-upper bound under the global oracle model.
Following a similar approach, we propose a novel adaptation of SVRG which is both compatible with oracles and achieves complexity bounds of $\tilde{O}(n^2 + n\sqrt{L/\mu})\log(1/\epsilon)$ and $O(n\sqrt{L/\epsilon})$, for $\mu > 0$ and $\mu = 0$, respectively.
arXiv Detail & Related papers (2020-02-09T03:39:46Z)