Exact Fractional Inference via Re-Parametrization & Interpolation
between Tree-Re-Weighted- and Belief Propagation- Algorithms
- URL: http://arxiv.org/abs/2301.10369v2
- Date: Wed, 6 Mar 2024 15:25:57 GMT
- Title: Exact Fractional Inference via Re-Parametrization & Interpolation
between Tree-Re-Weighted- and Belief Propagation- Algorithms
- Authors: Hamidreza Behjoo, Michael Chertkov
- Abstract summary: Inference efforts -- required to compute the partition function, $Z$, of an Ising model over a graph of $N$ ``spins" -- are most likely exponential in $N$.
We show how to express $Z$ as a product, $\forall \lambda:\ Z=Z^{(\lambda)}{\cal Z}^{(\lambda)}$, where the multiplicative correction, ${\cal Z}^{(\lambda)}$, is an expectation over a node-independent probability distribution built from node-wise fractional marginals.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Inference efforts -- required to compute partition function, $Z$, of an Ising
model over a graph of $N$ ``spins" -- are most likely exponential in $N$.
Efficient variational methods, such as the Belief Propagation (BP) and Tree
Re-Weighted (TRW) algorithms, compute $Z$ approximately by minimizing the
respective (BP- or TRW-) free energy. We generalize the variational scheme by building a
$\lambda$-fractional-homotopy, $Z^{(\lambda)}$, where $\lambda=0$ and
$\lambda=1$ correspond to TRW- and BP-approximations, respectively, and
$Z^{(\lambda)}$ decreases with $\lambda$ monotonically. Moreover, this
fractional scheme guarantees that in the attractive (ferromagnetic) case
$Z^{(TRW)}\geq Z^{(\lambda)}\geq Z^{(BP)}$, and there exists a unique
(``exact") $\lambda_*$ such that, $Z=Z^{(\lambda_*)}$. Generalizing the
re-parametrization approach of \citep{wainwright_tree-based_2002} and the loop
series approach of \citep{chertkov_loop_2006}, we show how to express $Z$ as a
product, $\forall \lambda:\ Z=Z^{(\lambda)}{\cal Z}^{(\lambda)}$, where the
multiplicative correction, ${\cal Z}^{(\lambda)}$, is an expectation over a
node-independent probability distribution built from node-wise fractional
marginals. Our theoretical analysis is complemented by extensive experiments
with models from Ising ensembles over planar and random graphs of medium and
large sizes. The empirical study yields a number of interesting observations,
such as (a) the ability to estimate ${\cal Z}^{(\lambda)}$ with $O(N^4)$ fractional
samples; (b) the suppression of $\lambda_*$ fluctuations as $N$ increases for
instances from a particular random Ising ensemble.
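The quantity at the center of the abstract, the partition function $Z$, is exact but exponential-in-$N$ to compute by enumeration; BP- and TRW-style variational methods exist precisely to avoid this cost. A minimal brute-force sketch of the object being approximated (the function name, couplings, and fields below are illustrative, not from the paper):

```python
import itertools
import math

def exact_partition_function(N, J, h, beta=1.0):
    """Exact Z = sum over all 2^N spin configurations s of exp(-beta * E(s)),
    with energy E(s) = -sum_{(i,j)} J_ij s_i s_j - sum_i h_i s_i.
    The loop over 2^N configurations is the exponential cost that
    BP/TRW-style variational approximations are designed to avoid."""
    Z = 0.0
    for s in itertools.product([-1, 1], repeat=N):
        energy = -sum(Jij * s[i] * s[j] for (i, j), Jij in J.items())
        energy -= sum(h[i] * s[i] for i in range(N))
        Z += math.exp(-beta * energy)
    return Z

# Small ferromagnetic chain (J_ij > 0: the "attractive" case, where the
# abstract guarantees Z^(TRW) >= Z^(lambda) >= Z^(BP)).
J = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0}
h = [0.1, 0.1, 0.1, 0.1]
Z = exact_partition_function(4, J, h)
```

For a single spin with field $h$ this reduces to the textbook value $Z = 2\cosh(\beta h)$, which makes the sketch easy to sanity-check.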
Related papers
- Manifold learning in Wasserstein space [3.2315169215372443]
We show that the metric space $(\Lambda, W_\Lambda)$ can be recovered in the sense of Gromov--Wasserstein from a graph with nodes $\{\lambda_i\}_{i=1}^N$ and edge weights $W(\lambda_i, \lambda_j)$.
arXiv Detail & Related papers (2023-11-14T21:21:35Z)
- A Unified Framework for Uniform Signal Recovery in Nonlinear Generative Compressed Sensing [68.80803866919123]
Under nonlinear measurements, most prior results are non-uniform, i.e., they hold with high probability for a fixed $\mathbf{x}^*$ rather than for all $\mathbf{x}^*$ simultaneously.
Our framework accommodates GCS with 1-bit/uniformly quantized observations and single index models as canonical examples.
We also develop a concentration inequality that produces tighter bounds for product processes whose index sets have low metric entropy.
arXiv Detail & Related papers (2023-09-25T17:54:19Z)
- Learning a Single Neuron with Adversarial Label Noise via Gradient Descent [50.659479930171585]
We study functions of the form $\mathbf{x} \mapsto \sigma(\mathbf{w} \cdot \mathbf{x})$ for monotone activations $\sigma$.
The goal of the learner is to output a hypothesis vector $\mathbf{w}$ such that $F(\mathbf{w}) = C \cdot \mathrm{OPT} + \epsilon$ with high probability.
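The single-neuron objective in this entry can be minimized with a few lines of plain gradient descent on the square loss. A minimal sketch (the toy data, ReLU activation, learning rate, and step count are illustrative assumptions, not the paper's algorithm or its adversarial noise model):

```python
import numpy as np

def learn_single_neuron(X, y, sigma, sigma_prime, lr=0.1, steps=500, seed=0):
    """Gradient descent on the square loss F(w) = mean((sigma(X w) - y)^2)
    for a monotone activation sigma with (sub)derivative sigma_prime."""
    rng = np.random.default_rng(seed)
    w = 0.01 * rng.normal(size=X.shape[1])  # small random init
    for _ in range(steps):
        z = X @ w
        residual = sigma(z) - y
        # Chain rule: dF/dw = (2/m) X^T [(sigma(z) - y) * sigma'(z)]
        w -= lr * (2.0 / len(y)) * X.T @ (residual * sigma_prime(z))
    return w

# Noiseless toy data from a planted ReLU neuron (monotone activation).
rng = np.random.default_rng(1)
X = rng.normal(size=(400, 3))
w_star = np.array([1.0, -2.0, 0.5])
relu = lambda z: np.maximum(z, 0.0)
relu_prime = lambda z: (z > 0).astype(float)
y = relu(X @ w_star)
w_hat = learn_single_neuron(X, y, relu, relu_prime)
loss = np.mean((relu(X @ w_hat) - y) ** 2)
```

In the noiseless realizable case above the loss drops well below that of the zero predictor; the paper's contribution concerns what remains achievable once adversarial label noise is added.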
arXiv Detail & Related papers (2022-06-17T17:55:43Z)
- Asymptotic Theory of $\ell_1$-Regularized PDE Identification from a Single Noisy Trajectory [2.0299248281970956]
We prove the support recovery for a general class of linear and nonlinear evolutionary partial differential equation (PDE) identification from a single noisy trajectory.
We provide a set of sufficient conditions which guarantee that, from a single trajectory denoised by a Local-Polynomial filter, the support of $\mathbf{c}(\lambda)$ asymptotically converges to the true signed-support associated with the underlying PDE.
arXiv Detail & Related papers (2021-03-12T02:23:04Z)
- An Optimal Separation of Randomized and Quantum Query Complexity [67.19751155411075]
We prove that for every decision tree, the absolute values of the Fourier coefficients of given order $\ell$ sum to at most $c^{\ell}\sqrt{\binom{d}{\ell}(1+\log n)^{\ell-1}}$, where $n$ is the number of variables, $d$ is the tree depth, and $c>0$ is an absolute constant.
arXiv Detail & Related papers (2020-08-24T06:50:57Z)
- $k$-Forrelation Optimally Separates Quantum and Classical Query Complexity [3.4984289152418753]
We show that any partial function on $N$ bits can be computed with an advantage $\delta$ over a random guess by making $q$ quantum queries.
We also conjectured the $k$-Forrelation problem -- a partial function that can be computed with $q = \lceil k/2 \rceil$ quantum queries.
arXiv Detail & Related papers (2020-08-16T21:26:46Z)
- $\lambda$-Regularized A-Optimal Design and its Approximation by $\lambda$-Regularized Proportional Volume Sampling [1.256413718364189]
We study the $\lambda$-regularized $A$-optimal design problem and introduce the $\lambda$-regularized proportional volume sampling algorithm.
The problem is motivated from optimal design in ridge regression, where one tries to minimize the expected squared error of the ridge regression predictor from the true coefficient in the underlying linear model.
arXiv Detail & Related papers (2020-06-19T15:17:57Z)
- Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample Complexity [59.34067736545355]
Given an MDP with $S$ states, $A$ actions, the discount factor $\gamma \in (0,1)$, and an approximation threshold $\epsilon > 0$, we provide a model-free algorithm to learn an $\epsilon$-optimal policy.
For small enough $\epsilon$, we show an improved algorithm with better sample complexity.
arXiv Detail & Related papers (2020-06-06T13:34:41Z)
- A Randomized Algorithm to Reduce the Support of Discrete Measures [79.55586575988292]
Given a discrete probability measure supported on $N$ atoms and a set of $n$ real-valued functions, there exists a probability measure that is supported on a subset of $n+1$ of the original $N$ atoms.
We give a simple geometric characterization of barycenters via negative cones and derive a randomized algorithm that computes this new measure by "greedy geometric sampling".
We then study its properties, and benchmark it on synthetic and real-world data to show that it can be very beneficial in the $N \gg n$ regime.
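The reduction described in this entry is a Carathéodory-type recombination: repeatedly move mass along a kernel direction of the moment matrix until at most $n+1$ atoms remain, preserving all $n$ expectations and the total mass. A minimal sketch (the SVD-based kernel step below is one simple choice, not necessarily the paper's "greedy geometric sampling"):

```python
import numpy as np

def reduce_support(weights, Phi, tol=1e-12):
    """Reduce a probability measure on N atoms to at most n+1 atoms while
    preserving the expectations of the n functions in Phi (shape (n, N))."""
    w = weights.astype(float).copy()
    n = Phi.shape[0]
    while np.count_nonzero(w > tol) > n + 1:
        support = np.flatnonzero(w > tol)
        # Moments plus a row of ones (so total mass is preserved too).
        A = np.vstack([Phi[:, support], np.ones(len(support))])
        # A is (n+1) x m with m > n+1, so it has a nontrivial kernel;
        # moving mass along a kernel vector leaves all expectations fixed.
        v = np.linalg.svd(A)[2][-1]
        if v[np.argmax(np.abs(v))] < 0:
            v = -v  # ensure v has a substantial positive entry
        pos = v > 1e-9
        # Largest step keeping all weights nonnegative; zeroes >= 1 atom.
        t = np.min(w[support][pos] / v[pos])
        w[support] -= t * v
        w[w < tol] = 0.0
    return w

rng = np.random.default_rng(0)
N, n = 50, 3
Phi = rng.normal(size=(n, N))       # n test functions on N atoms
w0 = rng.dirichlet(np.ones(N))      # original measure on N atoms
w1 = reduce_support(w0, Phi)        # same n moments, <= n+1 atoms
```

Each iteration kills at least one atom, so the loop runs at most $N-(n+1)$ times, and every step leaves $\Phi w$ and $\sum_i w_i$ unchanged up to floating-point error.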
arXiv Detail & Related papers (2020-06-02T16:38:36Z)
- On the Complexity of Minimizing Convex Finite Sums Without Using the Indices of the Individual Functions [62.01594253618911]
We exploit the finite noise structure of finite sums to derive a matching $O(n^2)$-upper bound under the global oracle model.
Following a similar approach, we propose a novel adaptation of SVRG which is both compatible with oracles, and achieves complexity bounds of $\tilde{O}\big((n^2+n\sqrt{L/\mu})\log(1/\epsilon)\big)$ and $O(n\sqrt{L/\epsilon})$, for $\mu>0$ and $\mu=0$, respectively.
arXiv Detail & Related papers (2020-02-09T03:39:46Z)
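The SVRG adaptation in the last entry builds on the standard SVRG update: correct each stochastic gradient with a periodically refreshed full gradient, so the estimator stays unbiased while its variance vanishes at the optimum. A generic sketch on a toy least-squares finite sum (the step size, epoch counts, and quadratic objective are illustrative assumptions, not the paper's oracle-compatible variant):

```python
import numpy as np

def svrg(grad_i, n, x0, lr=0.01, outer=30, inner=None, seed=0):
    """SVRG for min_x (1/n) sum_i f_i(x): each inner step uses the
    variance-reduced estimate grad_i(i, x) - grad_i(i, snap) + full_grad(snap),
    which is unbiased and whose variance shrinks as x approaches x*."""
    rng = np.random.default_rng(seed)
    m = inner if inner is not None else 2 * n
    x = x0.copy()
    for _ in range(outer):
        snap = x.copy()
        # Full gradient at the snapshot point (recomputed once per epoch).
        mu = np.mean([grad_i(i, snap) for i in range(n)], axis=0)
        for _ in range(m):
            i = rng.integers(n)
            x = x - lr * (grad_i(i, x) - grad_i(i, snap) + mu)
    return x

# Toy strongly convex finite sum: f_i(x) = 0.5 * (a_i . x - b_i)^2.
rng = np.random.default_rng(1)
A = rng.normal(size=(100, 5))
x_star = rng.normal(size=5)
b = A @ x_star
grad_i = lambda i, x: (A[i] @ x - b[i]) * A[i]
x_hat = svrg(grad_i, 100, np.zeros(5))
```

The full-gradient recomputation once per epoch is what yields the $\log(1/\epsilon)$ dependence in the strongly convex ($\mu>0$) regime, in contrast to plain SGD's polynomial dependence on $1/\epsilon$.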
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.