Efficient Evaluation of the Partition Function of RBMs with Annealed
Importance Sampling
- URL: http://arxiv.org/abs/2007.11926v1
- Date: Thu, 23 Jul 2020 10:59:04 GMT
- Title: Efficient Evaluation of the Partition Function of RBMs with Annealed
Importance Sampling
- Authors: Ferran Mazzanti and Enrique Romero
- Abstract summary: The Annealed Importance Sampling (AIS) method provides a tool to stochastically estimate the partition function of the system.
We analyze the performance of AIS in both small- and large-sized problems, and show that in both cases a good estimation of Z can be obtained with little computational cost.
- Score: 0.30458514384586394
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Probabilistic models based on Restricted Boltzmann Machines (RBMs) imply the
evaluation of normalized Boltzmann factors, which in turn requires the
evaluation of the partition function Z. The exact evaluation of Z, though,
becomes prohibitively expensive as the system size increases. The problem
worsens with the most common learning algorithms for RBMs, where the exact
evaluation of the gradient of the log-likelihood of the empirical
distribution of the data requires computing Z at each iteration. The
Annealed Importance Sampling (AIS) method provides a tool to stochastically
estimate the partition function of the system. So far, the standard use of the
AIS algorithm in the Machine Learning context has relied on a large
number of Monte Carlo steps. In this work we show that this may not be required
if a suitable starting probability distribution is used to initialize
the AIS algorithm. We analyze the performance of AIS in both small- and
large-sized problems, and show that in both cases a good estimation of Z can be
obtained with little computational cost.
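As a concrete illustration of the method under discussion, below is a minimal sketch of the standard AIS estimator for a binary RBM, in the spirit of the Salakhutdinov-Murray construction, annealing from a factorized starting distribution p0(v) proportional to exp(a.v). Names, defaults, and the choice of base distribution are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
softplus = lambda x: np.logaddexp(0.0, x)

def ais_log_z(W, b, c, a, n_runs=100, n_betas=1000):
    """Estimate log Z of an RBM (weights W: visible x hidden, visible bias b,
    hidden bias c) by AIS, annealing from p0(v) ~ exp(a.v) to the RBM."""
    nv, nh = W.shape
    betas = np.linspace(0.0, 1.0, n_betas)

    def log_f(v, beta):
        # Unnormalized log-probability of the intermediate RBM at inverse
        # temperature beta (hidden units summed out analytically).
        return ((1 - beta) * v @ a + beta * v @ b
                + softplus(beta * (v @ W + c)).sum(axis=1))

    # Exact samples from p0, and its log partition function.
    v = (rng.random((n_runs, nv)) < sigmoid(a)).astype(float)
    log_z0 = nh * np.log(2.0) + softplus(a).sum()

    log_w = np.zeros(n_runs)
    for k in range(n_betas - 1):
        log_w += log_f(v, betas[k + 1]) - log_f(v, betas[k])
        beta = betas[k + 1]
        # One Gibbs sweep that leaves the intermediate distribution invariant.
        h = (rng.random((n_runs, nh)) < sigmoid(beta * (v @ W + c))).astype(float)
        v = (rng.random((n_runs, nv))
             < sigmoid((1 - beta) * a + beta * (b + h @ W.T))).astype(float)

    # log Z estimate: log Z0 plus the log of the mean importance weight.
    return log_z0 + np.logaddexp.reduce(log_w) - np.log(n_runs)
```

The paper's point, in these terms, is that with a well-chosen starting distribution (the logits a) the number of annealing steps n_betas can be reduced drastically without degrading the estimate.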
Related papers
- Iterative Methods for Full-Scale Gaussian Process Approximations for Large Spatial Data [9.913418444556486]
We show how iterative methods can be used to reduce the computational costs for calculating likelihoods, gradients, and predictive distributions with FSAs.
We also present a novel, accurate, and fast way to calculate predictive variances relying on estimations and iterative methods.
All methods are implemented in a free C++ software library with high-level Python and R packages.
arXiv Detail & Related papers (2024-05-23T12:25:22Z)
- Mean field initialization of the Annealed Importance Sampling algorithm for an efficient evaluation of the Partition Function of Restricted Boltzmann Machines [0.0]
Annealed Importance Sampling (AIS) is a tool to estimate the partition function of a system.
We show that both the quality of the estimation and the cost of the computation can be significantly improved by using a properly selected mean-field starting probability distribution.
We conclude that mean-field distributions are good starting points for estimating the partition function with AIS at relatively low computational cost.
arXiv Detail & Related papers (2024-04-17T10:22:03Z)
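One plausible reading of the mean-field initialization described in the entry above, expressed against the ais_log_z sketch given earlier: iterate the naive mean-field fixed-point equations of the RBM and use the resulting visible magnetizations as the logits of the factorized starting distribution. This is an assumed reconstruction, not the paper's exact recipe.

```python
import numpy as np

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def mean_field_base_logits(W, b, c, n_iter=50):
    # Naive mean-field fixed point for the RBM magnetizations
    # (hypothetical reconstruction; the paper's construction may differ).
    m = sigmoid(b)                      # visible magnetizations
    for _ in range(n_iter):
        q = sigmoid(c + m @ W)          # hidden magnetizations given m
        m = sigmoid(b + W @ q)          # visible magnetizations given q
    return np.log(m) - np.log1p(-m)     # logits a, so that p0(v_i = 1) = m_i

# Usage with the earlier sketch:
# a = mean_field_base_logits(W, b, c)
# log_z = ais_log_z(W, b, c, a, n_betas=100)  # far fewer steps may suffice
```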
- Efficient Model-Free Exploration in Low-Rank MDPs [76.87340323826945]
Low-Rank Markov Decision Processes offer a simple yet expressive framework for RL with function approximation.
Existing algorithms are either (1) computationally intractable, or (2) reliant upon restrictive statistical assumptions.
We propose the first provably sample-efficient algorithm for exploration in Low-Rank MDPs.
arXiv Detail & Related papers (2023-07-08T15:41:48Z)
- Generalizing Backpropagation for Gradient-Based Interpretability [103.2998254573497]
We show that the gradient of a model is a special case of a more general formulation using semirings.
This observation allows us to generalize the backpropagation algorithm to efficiently compute other interpretable statistics.
arXiv Detail & Related papers (2023-07-06T15:19:53Z)
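To make the semiring idea in the entry above concrete: on a toy computation graph, ordinary backpropagation aggregates path products of local derivatives in the sum-product semiring; replacing (+, x) with (max, x) extracts the single most influential path instead. A minimal, hypothetical illustration, not code from the paper:

```python
# Toy DAG with two paths from x to y; edge values are local derivatives.
edges = {('x', 'a'): 2.0, ('a', 'y'): 0.5,
         ('x', 'b'): 3.0, ('b', 'y'): 0.4}
paths = [[('x', 'a'), ('a', 'y')],
         [('x', 'b'), ('b', 'y')]]

def aggregate(plus, times, one):
    # Fold each path with `times`, combine paths with `plus`:
    # the semiring generalization of the chain rule.
    total = None
    for path in paths:
        val = one
        for e in path:
            val = times(val, edges[e])
        total = val if total is None else plus(total, val)
    return total

grad = aggregate(lambda u, v: u + v, lambda u, v: u * v, 1.0)  # dy/dx = 2.2
top = aggregate(max, lambda u, v: u * v, 1.0)                  # strongest path: 1.2
print(grad, top)
```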
- Provably Efficient Representation Learning with Tractable Planning in Low-Rank POMDP [81.00800920928621]
We study representation learning in partially observable Markov Decision Processes (POMDPs).
We first present an algorithm for decodable POMDPs that combines maximum likelihood estimation (MLE) and optimism in the face of uncertainty (OFU).
We then show how to adapt this algorithm to also work in the broader class of $\gamma$-observable POMDPs.
arXiv Detail & Related papers (2023-06-21T16:04:03Z)
- Improvements to Supervised EM Learning of Shared Kernel Models by Feature Space Partitioning [0.0]
This paper addresses the lack of rigour in the derivation of the EM training algorithm and the computational complexity of the technique.
We first present a detailed derivation of EM for the Gaussian shared kernel model PRBF classifier.
To reduce the complexity of the resulting SKEM algorithm, we partition the feature space into $R$ non-overlapping subsets of variables.
arXiv Detail & Related papers (2022-05-31T09:18:58Z)
- Parallel Stochastic Mirror Descent for MDPs [72.75921150912556]
We consider the problem of learning the optimal policy for infinite-horizon Markov decision processes (MDPs).
A variant of Stochastic Mirror Descent is proposed for convex programming problems with Lipschitz-continuous functionals.
We analyze this algorithm in a general case and obtain an estimate of the convergence rate that does not accumulate errors during the operation of the method.
arXiv Detail & Related papers (2021-02-27T19:28:39Z)
- Sample-Efficient Reinforcement Learning of Undercomplete POMDPs [91.40308354344505]
This work shows that these hardness barriers do not preclude efficient reinforcement learning for rich and interesting subclasses of Partially Observable Markov Decision Processes (POMDPs).
We present a sample-efficient algorithm, OOM-UCB, for episodic finite undercomplete POMDPs, where the number of observations is larger than the number of latent states and where exploration is essential for learning, thus distinguishing our results from prior works.
arXiv Detail & Related papers (2020-06-22T17:58:54Z)
- Learning Minimax Estimators via Online Learning [55.92459567732491]
We consider the problem of designing minimax estimators for estimating parameters of a probability distribution.
We construct an algorithm for finding a mixed strategy Nash equilibrium.
arXiv Detail & Related papers (2020-06-19T22:49:42Z)
- Scalable Distributed Approximation of Internal Measures for Clustering Evaluation [5.144809478361603]
A widely used internal measure for clustering evaluation is the silhouette coefficient, whose computation requires a quadratic number of distance calculations.
We present the first scalable algorithm to compute a rigorous approximation of the silhouette coefficient for the evaluation of clusterings based on any metric distance.
We also prove that the algorithm can be adapted to obtain rigorous approximations of other internal measures of clustering quality, such as cohesion and separation.
arXiv Detail & Related papers (2020-03-03T10:28:14Z)
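For context on that last entry, the exact silhouette coefficient is the quadratic-cost baseline that the paper's distributed approximation is designed to avoid; a textbook sketch (not the paper's algorithm) follows.

```python
import numpy as np

def silhouette_exact(X, labels):
    # Exact silhouette: requires all O(n^2) pairwise distances.
    # X: (n, d) array of points; labels: (n,) integer array of cluster ids.
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    clusters = np.unique(labels)
    scores = np.empty(len(X))
    for i in range(len(X)):
        same = labels == labels[i]
        same[i] = False                               # exclude the point itself
        a = D[i, same].mean() if same.any() else 0.0  # mean intra-cluster dist
        b = min(D[i, labels == k].mean()              # nearest other cluster
                for k in clusters if k != labels[i])
        scores[i] = 0.0 if max(a, b) == 0 else (b - a) / max(a, b)
    return scores.mean()
```

A sampling-based estimate replaces the full distance matrix with distances to a small random subset of points per cluster, which is, roughly, the direction the paper makes rigorous and distributed.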