Deep Involutive Generative Models for Neural MCMC
- URL: http://arxiv.org/abs/2006.15167v2
- Date: Thu, 2 Jul 2020 15:42:01 GMT
- Title: Deep Involutive Generative Models for Neural MCMC
- Authors: Span Spanbauer, Cameron Freer, Vikash Mansinghka
- Abstract summary: We introduce deep involutive generative models, a new architecture for deep generative modeling, and use them to define Involutive Neural MCMC.
We show how to make these models volume preserving, and how to use deep volume-preserving involutive generative models to make valid Metropolis-Hastings updates.
- Score: 3.6739949215165164
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce deep involutive generative models, a new architecture for deep
generative modeling, and use them to define Involutive Neural MCMC, a new
approach to fast neural MCMC. An involutive generative model represents a
probability kernel $G(\phi \mapsto \phi')$ as an involutive (i.e.,
self-inverting) deterministic function $f(\phi, \pi)$ on an enlarged state
space containing auxiliary variables $\pi$. We show how to make these models
volume preserving, and how to use deep volume-preserving involutive generative
models to make valid Metropolis-Hastings updates based on an auxiliary variable
scheme with an easy-to-calculate acceptance ratio. We prove that deep
involutive generative models and their volume-preserving special case are
universal approximators for probability kernels. This result implies that with
enough network capacity and training time, they can be used to learn
arbitrarily complex MCMC updates. We define a loss function and optimization
algorithm for training parameters given simulated data. We also provide initial
experiments showing that Involutive Neural MCMC can efficiently explore
multi-modal distributions that are intractable for Hybrid Monte Carlo, and can
converge faster than A-NICE-MC, a recently introduced neural MCMC technique.
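To make the construction concrete, here is a minimal numpy sketch of one involutive Metropolis-Hastings update. It is a toy under stated assumptions: a hand-coded swap involution $f(\phi, \pi) = (\pi, \phi)$ stands in for the learned deep network, and since the swap is volume preserving, no Jacobian term appears in the acceptance ratio.
```python
import numpy as np

rng = np.random.default_rng(0)

def log_p(phi):
    # unnormalized target: a two-mode 1-D mixture
    return np.logaddexp(-0.5 * (phi - 3.0) ** 2, -0.5 * (phi + 3.0) ** 2)

def log_q(pi):
    # density of the auxiliary variable pi ~ N(0, 3^2)
    return -0.5 * pi ** 2 / 9.0

def f(phi, pi):
    # a hand-coded involution: f(f(phi, pi)) == (phi, pi);
    # the paper instead *learns* f as a deep involutive network
    return pi, phi

phi, samples = 0.0, []
for _ in range(10_000):
    pi = 3.0 * rng.standard_normal()      # sample the auxiliary variable
    phi_new, pi_new = f(phi, pi)          # deterministic involutive move
    # volume-preserving f: the MH ratio needs no Jacobian correction
    log_alpha = log_p(phi_new) + log_q(pi_new) - log_p(phi) - log_q(pi)
    if np.log(rng.uniform()) < log_alpha:
        phi = phi_new
    samples.append(phi)
```
With the swap involution this reduces to an independence sampler; the paper's point is that the same acceptance rule stays valid for arbitrarily expressive learned involutions.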
Related papers
- Deep unfolding of MCMC kernels: scalable, modular & explainable GANs for high-dimensional posterior sampling [1.930761833716203]
We introduce a novel approach to GAN architecture design by applying deep unfolding to Langevin MCMC algorithms.
This paradigm maps fixed-step iterative algorithms onto modular neural networks, yielding architectures that are both flexible and amenable to interpretation.
We train these unfolded samplers end-to-end using a supervised regularized Wasserstein GAN framework for posterior sampling.
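As a rough illustration of the unfolding idea (not the paper's architecture), the sketch below unrolls a fixed number of Langevin steps into a "network" whose per-layer step sizes would be the trainable parameters; the score function and all names are stand-ins.
```python
import numpy as np

def score(x):
    # gradient of log p(x) for a standard Gaussian target (a stand-in)
    return -x

def unfolded_langevin(x0, etas, rng):
    """One 'forward pass': K fixed Langevin steps, one step size eta per layer."""
    x = x0
    for eta in etas:  # each iteration plays the role of one network layer
        x = x + eta * score(x) + np.sqrt(2.0 * eta) * rng.standard_normal(x.shape)
    return x

rng = np.random.default_rng(1)
etas = np.full(10, 0.1)  # in the paper these would be trained end-to-end
samples = unfolded_langevin(rng.standard_normal((512, 2)), etas, rng)
```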
arXiv Detail & Related papers (2026-02-24T10:37:10Z)
- FBMS: An R Package for Flexible Bayesian Model Selection and Model Averaging [14.487258585834374]
The FBMS package implements an efficient Mode Jumping Markov Chain Monte Carlo (MJMCMC) algorithm.
Within this framework, the algorithm maintains and updates populations of transformed features, computes their posterior probabilities, and evaluates the posteriors of models constructed from them.
We demonstrate the effective use of FBMS for both inferential and predictive modeling in Gaussian regression, focusing on different instances of the BGNLM class of models.
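FBMS itself is an R package; the hedged Python sketch below illustrates only the underlying idea of MCMC over model space, scoring feature-inclusion vectors with a BIC approximation rather than the package's MJMCMC machinery.
```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 200, 6
X = rng.standard_normal((n, p))
y = X[:, 0] - 2.0 * X[:, 2] + 0.5 * rng.standard_normal(n)

def bic(gamma):
    # BIC of the OLS fit using the features selected by indicator vector gamma
    Xs = X[:, gamma.astype(bool)]
    k = Xs.shape[1]
    resid = (y - Xs @ np.linalg.lstsq(Xs, y, rcond=None)[0]) if k else y
    return n * np.log(np.mean(resid ** 2)) + k * np.log(n)

gamma, counts = np.zeros(p), np.zeros(p)   # start from the empty model
for _ in range(5_000):
    prop = gamma.copy()
    prop[rng.integers(p)] = 1 - prop[rng.integers(p)]  # flip one indicator
    # Metropolis accept using exp(-BIC/2) as an approximate posterior score
    if np.log(rng.uniform()) < 0.5 * (bic(gamma) - bic(prop)):
        gamma = prop
    counts += gamma                         # running posterior inclusion frequencies
print("inclusion frequencies:", counts / 5_000)
```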
arXiv Detail & Related papers (2025-08-31T09:04:01Z)
- Train Faster, Perform Better: Modular Adaptive Training in Over-Parameterized Models [31.960749305728488]
We introduce a novel concept dubbed the modular neural tangent kernel (mNTK).
We show that the quality of a module's learning is tightly associated with its mNTK's principal eigenvalue $\lambda_{\max}$.
We propose a novel training strategy termed Modular Adaptive Training (MAT) to update those modules whose $\lambda_{\max}$ exceeds a dynamic threshold.
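A toy illustration of the quantity involved, assuming a two-layer tanh network: compute each module's empirical NTK Gram matrix from per-module Jacobians, then compare its principal eigenvalue against a threshold (fixed here, dynamic in the paper).
```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((32, 4))                      # a small batch
W1, W2 = rng.standard_normal((4, 8)), rng.standard_normal((8, 1))

def per_module_jacobians(X, W1, W2):
    """Jacobian of the scalar outputs w.r.t. each module's parameters."""
    H = np.tanh(X @ W1)
    dH = 1.0 - H ** 2
    J2 = H                                            # output layer: d out / d W2 = H
    J1 = np.stack([np.outer(x, dh * W2[:, 0]).ravel() # hidden layer: chain rule
                   for x, dh in zip(X, dH)])
    return [J1, J2]

threshold = 5.0                                        # illustrative fixed threshold
for m, J in enumerate(per_module_jacobians(X, W1, W2), start=1):
    ntk = J @ J.T                                     # empirical mNTK Gram matrix
    lam_max = np.linalg.eigvalsh(ntk)[-1]             # principal eigenvalue
    print(f"module {m}: lambda_max={lam_max:.2f}, update={lam_max > threshold}")
```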
arXiv Detail & Related papers (2024-05-13T07:46:48Z)
- Scaling and renormalization in high-dimensional regression [72.59731158970894]
We present a unifying perspective on recent results on ridge regression.
We use the basic tools of random matrix theory and free probability, aimed at readers with backgrounds in physics and deep learning.
Our results extend and provide a unifying perspective on earlier models of scaling laws.
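The object of study is easy to reproduce empirically. This sketch sweeps the ridge penalty on synthetic overparameterized data and measures test error; the paper instead derives such curves analytically with random matrix tools.
```python
import numpy as np

rng = np.random.default_rng(4)
n, d = 200, 400                                   # overparameterized regime, d > n
w_true = rng.standard_normal(d) / np.sqrt(d)
X, Xt = rng.standard_normal((n, d)), rng.standard_normal((1000, d))
y = X @ w_true + 0.1 * rng.standard_normal(n)

for lam in [1e-3, 1e-1, 1e1]:
    # ridge estimator: w = (X^T X + lam I)^{-1} X^T y
    w = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)
    test_err = np.mean((Xt @ w - Xt @ w_true) ** 2)
    print(f"lambda={lam:g}: test error {test_err:.4f}")
```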
arXiv Detail & Related papers (2024-05-01T15:59:00Z)
- Gaussian Process Neural Additive Models [3.7969209746164325]
We propose a new subclass of Neural Additive Models (NAMs) that use a single-layer neural network construction of the Gaussian process via random Fourier features.
GP-NAMs have the advantage of a convex objective function and number of trainable parameters that grows linearly with feature dimensionality.
We show that GP-NAM achieves comparable or better performance in both classification and regression tasks with a large reduction in the number of parameters.
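A minimal numpy sketch of the construction, assuming an RBF kernel: each input dimension gets its own random Fourier feature map, and fitting the concatenated features is a convex ridge problem whose parameter count grows linearly with feature dimensionality.
```python
import numpy as np

rng = np.random.default_rng(5)
n, D, m = 300, 5, 20                              # m random features per input dim
X = rng.uniform(-2, 2, size=(n, D))
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + 0.1 * rng.standard_normal(n)

# per-feature RFF map: z_d(x) = sqrt(2/m) * cos(w x + b), approximating an RBF GP
W = rng.standard_normal((D, m))                   # random frequencies
B = rng.uniform(0, 2 * np.pi, size=(D, m))        # random phases
Z = np.concatenate([np.sqrt(2.0 / m) * np.cos(X[:, [d]] * W[d] + B[d])
                    for d in range(D)], axis=1)   # (n, D*m): linear in D

# convex objective: plain ridge regression on the concatenated features
theta = np.linalg.solve(Z.T @ Z + 1e-2 * np.eye(D * m), Z.T @ y)
print("train MSE:", np.mean((Z @ theta - y) ** 2))
```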
arXiv Detail & Related papers (2024-02-19T20:29:34Z)
- Generative Marginalization Models [21.971818180264943]
Marginalization models (MAMs) are a new family of generative models for high-dimensional discrete data.
They offer scalable and flexible generative modeling by explicitly modeling all induced marginal distributions.
For energy-based training tasks, MAMs enable any-order generative modeling of high-dimensional problems beyond the scale of previous methods.
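A toy illustration of the marginalization self-consistency that such models enforce, using an exact probability table over three bits in place of a learned network; all names are illustrative.
```python
import numpy as np

rng = np.random.default_rng(6)
joint = rng.dirichlet(np.ones(8)).reshape(2, 2, 2)  # toy joint over 3 binary vars

# every induced marginal is available in closed form from the joint
marg_12 = joint.sum(axis=2)                         # p(x1, x2)
marg_3 = joint.sum(axis=(0, 1))                     # p(x3)

# MAM-style self-consistency: a learned marginal model q(x1, x2) should satisfy
# q(x1, x2) == sum over x3 of p(x1, x2, x3); penalize the squared violation
q = rng.dirichlet(np.ones(4)).reshape(2, 2)         # stand-in for a network output
consistency_loss = np.sum((q - marg_12) ** 2)
print("consistency loss:", consistency_loss)
```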
arXiv Detail & Related papers (2023-10-19T17:14:29Z)
- SE(3)-Stochastic Flow Matching for Protein Backbone Generation [54.951832422425454]
We introduce FoldFlow, a series of novel generative models of increasing modeling power based on the flow-matching paradigm over $3\mathrm{D}$ rigid motions.
Our family of FoldFlow generative models offers several advantages over previous approaches to the generative modeling of proteins.
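FoldFlow's flows live on $3\mathrm{D}$ rigid motions; the sketch below shows only the plain Euclidean conditional flow-matching objective with linear interpolation paths, which the SE(3) construction generalizes, using a stand-in linear velocity model.
```python
import numpy as np

rng = np.random.default_rng(7)

def v_theta(x_t, t, theta):
    """Stand-in velocity field: a linear model (a real model is a deep net)."""
    return x_t @ theta[:3] + t * theta[3]

def flow_matching_loss(theta, x0, x1, rng):
    t = rng.uniform(size=(len(x0), 1))
    x_t = (1.0 - t) * x0 + t * x1        # linear interpolation path
    u_t = x1 - x0                        # its exact conditional velocity
    return np.mean((v_theta(x_t, t, theta) - u_t) ** 2)

x0 = rng.standard_normal((256, 3))        # noise samples
x1 = rng.standard_normal((256, 3)) + 4.0  # stand-in "data" samples
theta = rng.standard_normal((4, 3))
print("loss:", flow_matching_loss(theta, x0, x1, rng))
```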
arXiv Detail & Related papers (2023-10-03T19:24:24Z)
- Fast variable selection makes scalable Gaussian process BSS-ANOVA a speedy and accurate choice for tabular and time series regression [0.0]
Gaussian processes (GPs) are non-parametric regression engines with a long history.
One of a number of scalable GP approaches is the Karhunen-Loève (KL) decomposed kernel BSS-ANOVA, developed in 2009.
A new method of forward variable selection quickly and effectively limits the number of terms, yielding a method with competitive accuracy.
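A hedged numpy sketch of generic greedy forward selection over candidate basis columns; in the paper the candidates are KL-decomposed BSS-ANOVA terms.
```python
import numpy as np

rng = np.random.default_rng(8)
n, p = 200, 30
Phi = rng.standard_normal((n, p))                 # candidate basis terms (columns)
y = Phi[:, 3] - 0.5 * Phi[:, 17] + 0.1 * rng.standard_normal(n)

selected, sse = [], np.sum(y ** 2)
for _ in range(5):                                # greedy forward passes
    best = None
    for j in set(range(p)) - set(selected):       # try adding each unused term
        cols = Phi[:, selected + [j]]
        r = y - cols @ np.linalg.lstsq(cols, y, rcond=None)[0]
        if (s := np.sum(r ** 2)) < sse:
            best, sse = j, s
    if best is None:                              # no term improves the fit: stop
        break
    selected.append(best)
print("selected terms:", selected)
```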
arXiv Detail & Related papers (2022-05-26T23:41:43Z)
- A new perspective on probabilistic image modeling [92.89846887298852]
We present a new probabilistic approach for image modeling capable of density estimation, sampling and tractable inference.
DCGMMs (deep convolutional Gaussian mixture models) can be trained end-to-end by SGD from random initial conditions, much like CNNs.
We show that DCGMMs compare favorably to several recent PC and SPN models in terms of inference, classification and sampling.
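A minimal sketch of the key training idea, a mixture log-likelihood optimized by SGD from random initialization, here for a one-dimensional GMM with only the means learned; DCGMMs stack convolutional mixture layers, which this does not attempt.
```python
import numpy as np

rng = np.random.default_rng(9)
X = np.concatenate([rng.normal(-2, 0.5, 500), rng.normal(2, 0.5, 500)])
K, lr = 2, 0.05
mu = rng.standard_normal(K)                       # random initial conditions

for _ in range(200):                              # SGD on the mixture NLL (means only)
    xb = rng.choice(X, size=64)                   # minibatch
    logp = -0.5 * (xb[:, None] - mu) ** 2 - np.log(K)  # equal weights, unit variance
    r = np.exp(logp - logp.max(axis=1, keepdims=True))
    r /= r.sum(axis=1, keepdims=True)             # responsibilities
    # gradient of the NLL w.r.t. mu_k is -mean_i r_ik (x_i - mu_k)
    mu += lr * np.mean(r * (xb[:, None] - mu), axis=0)
print("learned means:", np.sort(mu))
```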
arXiv Detail & Related papers (2022-03-21T14:53:57Z)
- Low-Rank Constraints for Fast Inference in Structured Models [110.38427965904266]
This work demonstrates a simple approach to reduce the computational and memory complexity of a large class of structured models.
Experiments with neural parameterized structured models for language modeling, polyphonic music modeling, unsupervised grammar induction, and video modeling show that our approach matches the accuracy of standard models at large state spaces.
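The core trick is easy to show on an HMM-style forward pass: factor the $N \times N$ transition matrix as $UV^\top$ so each step costs $O(Nr)$ instead of $O(N^2)$. A minimal sketch, with made-up scores:
```python
import numpy as np

rng = np.random.default_rng(10)
N, r, T = 1000, 16, 50
U, V = rng.random((N, r)), rng.random((N, r))     # implicit transition = U @ V.T
emit = rng.random((T, N))                         # per-step emission scores

alpha = np.full(N, 1.0 / N)
for t in range(T):
    # low-rank step: (alpha @ U) @ V.T costs O(N r); the full N x N
    # transition matrix is never materialized
    alpha = (alpha @ U) @ V.T * emit[t]
    alpha /= alpha.sum()                          # normalize for stability
```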
arXiv Detail & Related papers (2022-01-08T00:47:50Z)
- Inverting brain grey matter models with likelihood-free inference: a tool for trustable cytoarchitecture measurements [62.997667081978825]
Characterisation of the brain grey matter cytoarchitecture with quantitative sensitivity to soma density and volume remains an unsolved challenge in dMRI.
We propose a new forward model, specifically a new system of equations, requiring a few relatively sparse b-shells.
We then apply modern tools from Bayesian analysis known as likelihood-free inference (LFI) to invert our proposed model.
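The paper applies modern neural LFI; the sketch below shows only the simplest LFI variant, ABC rejection, against a stand-in scalar simulator (the real forward model maps tissue parameters to dMRI signals across b-shells), with all names hypothetical.
```python
import numpy as np

rng = np.random.default_rng(11)

def simulate(theta, rng):
    """Stand-in forward model; the paper's is a dMRI signal model."""
    return theta[0] + theta[1] ** 2 + 0.05 * rng.standard_normal()

observed = 1.3                                    # the measured summary statistic

def prior(rng):
    return rng.uniform(0, 2, size=2)

# ABC rejection: keep parameters whose simulations land near the observation
accepted = [th for th in (prior(rng) for _ in range(20_000))
            if abs(simulate(th, rng) - observed) < 0.05]
print(f"kept {len(accepted)} approximate posterior samples")
```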
arXiv Detail & Related papers (2021-11-15T09:08:27Z)
- Structured Stochastic Gradient MCMC [20.68905354115655]
We propose a new non-parametric variational approximation that makes no assumptions about the approximate posterior's functional form.
We obtain better predictive likelihoods and larger effective sample sizes than full SGMCMC.
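For orientation, a minimal sketch of the plain SGLD update that stochastic-gradient MCMC methods build on, estimating a Gaussian mean from minibatches; the paper's structured variational scheme is not reproduced here.
```python
import numpy as np

rng = np.random.default_rng(12)
X = rng.standard_normal(1000) + 1.5               # data with unknown mean
theta, eps, n = 0.0, 1e-3, len(X)

for step in range(2000):
    xb = rng.choice(X, size=32)                   # minibatch gradient estimate
    # log posterior: N(0,1) prior on theta, unit-variance Gaussian likelihood
    grad_logpost = -theta + n * np.mean(xb - theta)
    # SGLD: half-step on the noisy gradient plus injected Gaussian noise
    theta += 0.5 * eps * grad_logpost + np.sqrt(eps) * rng.standard_normal()
print("posterior mean estimate:", theta)
```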
arXiv Detail & Related papers (2021-07-19T17:18:10Z)
- Closed-form Continuous-Depth Models [99.40335716948101]
Continuous-depth neural models rely on advanced numerical differential equation solvers.
We present a new family of models, termed Closed-form Continuous-depth (CfC) networks, that are simple to describe and at least one order of magnitude faster.
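The speed argument is easiest to see on a linear ODE with a known solution: a closed-form expression answers a query in one evaluation where a numerical solver must loop. This sketch is only illustrative and is not the CfC cell itself.
```python
import numpy as np

a, b, x0, t = 2.0, 1.0, 0.0, 1.0                  # dx/dt = -a x + b

# closed form: one expression per query time, no solver loop
x_closed = x0 * np.exp(-a * t) + (b / a) * (1.0 - np.exp(-a * t))

# numerical solver: many small Euler steps for the same answer
x, steps = x0, 1000
for _ in range(steps):
    x += (t / steps) * (-a * x + b)
print(x_closed, x)                                # agree to roughly 1e-3
```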
arXiv Detail & Related papers (2021-06-25T22:08:51Z)
- Sparse Flows: Pruning Continuous-depth Models [107.98191032466544]
We show that pruning improves generalization for neural ODEs in generative modeling.
We also show that pruning finds minimal and efficient neural ODE representations with up to 98% less parameters compared to the original network, without loss of accuracy.
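A hedged sketch of global magnitude pruning, the generic mechanism behind such parameter reductions; applying it inside a neural ODE vector field, as the paper does, is not reproduced here, and the layer names are made up.
```python
import numpy as np

rng = np.random.default_rng(13)
weights = {name: rng.standard_normal(shape)        # stand-in model parameters
           for name, shape in [("W1", (64, 64)), ("W2", (64, 64))]}

sparsity = 0.98                                    # prune 98% of parameters
flat = np.abs(np.concatenate([w.ravel() for w in weights.values()]))
cutoff = np.quantile(flat, sparsity)               # global magnitude threshold
pruned = {name: np.where(np.abs(w) >= cutoff, w, 0.0)
          for name, w in weights.items()}

kept = sum(int((w != 0).sum()) for w in pruned.values())
print(f"kept {kept} of {flat.size} parameters")
```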
arXiv Detail & Related papers (2021-06-24T01:40:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.