Automatic variational inference with cascading flows
- URL: http://arxiv.org/abs/2102.04801v1
- Date: Tue, 9 Feb 2021 12:44:39 GMT
- Title: Automatic variational inference with cascading flows
- Authors: Luca Ambrogioni, Gianluigi Silvestri and Marcel van Gerven
- Abstract summary: We present a new family of variational programs that embed the forward pass of the prior program.
A cascading flows program interposes a newly designed highway flow architecture in between the conditional distributions of the prior program.
We evaluate the performance of the new variational programs in a series of structured inference problems.
- Score: 6.252236971703546
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The automation of probabilistic reasoning is one of the primary aims of
machine learning. Recently, the confluence of variational inference and deep
learning has led to powerful and flexible automatic inference methods that can
be trained by stochastic gradient descent. In particular, normalizing flows are
highly parameterized deep models that can fit arbitrarily complex posterior
densities. However, normalizing flows struggle in highly structured
probabilistic programs as they need to relearn the forward-pass of the program.
Automatic structured variational inference (ASVI) remedies this problem by
constructing variational programs that embed the forward-pass. Here, we combine
the flexibility of normalizing flows and the prior-embedding property of ASVI
in a new family of variational programs, which we named cascading flows. A
cascading flows program interposes a newly designed highway flow architecture
in between the conditional distributions of the prior program such as to steer
it toward the observed data. These programs can be constructed automatically
from an input probabilistic program and can also be amortized automatically. We
evaluate the performance of the new variational programs in a series of
structured inference problems. We find that cascading flows have much higher
performance than both normalizing flows and ASVI in a large set of structured
inference problems.
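As a rough illustration of this construction (not the authors' implementation), the sketch below threads a toy two-variable prior program through learnable gated affine maps; the gating plays the role of the highway flow's steering, and all names and parameter values are illustrative.

```python
import numpy as np

# Toy prior program: z ~ N(0, 1), x | z ~ N(z, 1).
# A cascading-flows-style variational program reuses this forward pass but
# interposes a learnable invertible map after each prior conditional, steering
# the samples toward the observed data.  The gated affine map below is a
# simplified stand-in for the paper's highway flow; its parameters would be
# trained by stochastic gradient ascent on an ELBO.

def gated_affine(z, log_scale, shift, gate_logit):
    """y = (1 - g) * z + g * (exp(log_scale) * z + shift), with g in (0, 1)."""
    g = 1.0 / (1.0 + np.exp(-gate_logit))
    return (1.0 - g) * z + g * (np.exp(log_scale) * z + shift)

def sample_variational_program(params, rng, n_samples=5):
    eps_z = rng.standard_normal(n_samples)       # draw from the prior conditional of z
    z = gated_affine(eps_z, *params["z"])        # steer z
    eps_x = z + rng.standard_normal(n_samples)   # draw from the prior conditional of x | z
    x = gated_affine(eps_x, *params["x"])        # steer x
    return z, x

params = {"z": (0.1, 0.5, 0.0), "x": (0.0, -0.2, 0.0)}   # (log_scale, shift, gate_logit)
print(sample_variational_program(params, np.random.default_rng(0)))
```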
Related papers
- Rethinking Variational Inference for Probabilistic Programs with Stochastic Support [23.07504711090434]
We introduce Support Decomposition Variational Inference (SDVI), a new variational inference (VI) approach for probabilistic programs with stochastic support.
SDVI instead breaks the program down into sub-programs with static support, before automatically building separate sub-guides for each.
This decomposition significantly aids in the construction of suitable variational families, enabling, in turn, substantial improvements in inference performance.
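A minimal sketch of that decomposition, assuming a simple two-branch program; the names, densities, and guide families are illustrative, not the authors' implementation.

```python
import numpy as np

# A program with stochastic support: which latent variables exist depends on a
# random branch choice.  An SDVI-style guide keeps one sub-guide per
# static-support sub-program, plus learnable weights over the sub-programs.

sub_guides = {
    "a": {"mean": 0.0, "log_std": 0.0},      # Gaussian guide: branch with an unconstrained latent
    "b": {"log_mean": 0.0, "log_std": 0.0},  # log-normal guide: branch with a positive latent
}
branch_logits = np.zeros(2)                  # learnable weights over the two sub-programs

def sample_guide(rng):
    probs = np.exp(branch_logits) / np.exp(branch_logits).sum()
    branch = rng.choice(["a", "b"], p=probs)
    g, eps = sub_guides[branch], rng.standard_normal()
    if branch == "a":
        return branch, g["mean"] + np.exp(g["log_std"]) * eps
    return branch, np.exp(g["log_mean"] + np.exp(g["log_std"]) * eps)

print(sample_guide(np.random.default_rng(0)))
```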
arXiv Detail & Related papers (2023-11-01T15:38:51Z)
- Free-form Flows: Make Any Architecture a Normalizing Flow [8.163244519983298]
We develop a training procedure that uses an efficient estimator for the gradient of the change of variables formula.
This enables any dimension-preserving neural network to serve as a generative model through maximum likelihood training.
We achieve excellent results in molecule generation benchmarks utilizing $E(n)$-equivariant networks.
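For reference, the change of variables objective that such training targets can be written directly with automatic differentiation; the sketch below computes the exact log-determinant for a tiny dimension-preserving network, whereas the paper's contribution is an efficient estimator for the gradient of this term.

```python
import jax
import jax.numpy as jnp

# Maximum-likelihood training of a dimension-preserving map f uses the change of
# variables formula  log p_X(x) = log p_Z(f(x)) + log |det J_f(x)|.
# Here the log-determinant is computed exactly from the full Jacobian, which is
# exactly the expensive step the paper's gradient estimator avoids.

def f(params, x):                                  # a tiny dimension-preserving MLP
    w1, b1, w2, b2 = params
    return jnp.tanh(x @ w1 + b1) @ w2 + b2

def log_likelihood(params, x):
    z = f(params, x)
    log_base = -0.5 * jnp.sum(z ** 2) - 0.5 * x.size * jnp.log(2.0 * jnp.pi)
    _, logdet = jnp.linalg.slogdet(jax.jacobian(lambda v: f(params, v))(x))
    return log_base + logdet

d = 3
key = jax.random.PRNGKey(0)
params = (0.1 * jax.random.normal(key, (d, d)), jnp.zeros(d),
          0.1 * jax.random.normal(key, (d, d)), jnp.zeros(d))
x = jnp.ones(d)
print(log_likelihood(params, x))
print(jax.grad(log_likelihood)(params, x)[1])      # gradient w.r.t. the first bias
```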
arXiv Detail & Related papers (2023-10-25T13:23:08Z)
- Smoothing Methods for Automatic Differentiation Across Conditional Branches [0.0]
Smooth interpretation (SI) approximates the convolution of a program's output with a Gaussian kernel, thus smoothing its output in a principled manner.
We combine SI with automatic differentiation (AD) to efficiently compute gradients of smoothed programs.
We propose a novel Monte Carlo estimator that avoids the underlying assumptions by estimating the smoothed programs' gradients through a combination of AD and sampling.
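A generic Gaussian-smoothing gradient estimator in this spirit (a sketch only, not the paper's exact construction):

```python
import numpy as np

# A program with a conditional branch is not differentiable at the branch point.
# Convolving its output with a Gaussian kernel gives a smooth surrogate
#   F(x) = E_{eps ~ N(0,1)}[ f(x + sigma * eps) ],
# whose gradient admits the Monte Carlo estimator
#   dF/dx ≈ mean( f(x + sigma * eps_i) * eps_i ) / sigma.

def program(x):
    # discontinuous branch: a jump at x = 0
    return x ** 2 if x > 0.0 else 1.0 + x

def smoothed_value_and_grad(x, sigma=0.1, n_samples=10_000, seed=0):
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal(n_samples)
    fx = np.array([program(x + sigma * e) for e in eps])
    return fx.mean(), (fx * eps).mean() / sigma

print(smoothed_value_and_grad(0.05))   # a finite, informative gradient despite the branch
```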
arXiv Detail & Related papers (2023-10-05T15:08:37Z)
- Transformers as Statisticians: Provable In-Context Learning with In-Context Algorithm Selection [88.23337313766353]
This work first provides a comprehensive statistical theory for transformers to perform in-context learning (ICL).
We show that transformers can implement a broad class of standard machine learning algorithms in context.
A single transformer can adaptively select different base ICL algorithms.
arXiv Detail & Related papers (2023-06-07T17:59:31Z)
- Distributional GFlowNets with Quantile Flows [73.73721901056662]
Generative Flow Networks (GFlowNets) are a new family of probabilistic samplers where an agent learns a policy for generating complex structure through a series of decision-making steps.
In this work, we adopt a distributional paradigm for GFlowNets, turning each flow function into a distribution, thus providing more informative learning signals during training.
Our proposed quantile matching GFlowNet learning algorithm is able to learn a risk-sensitive policy, an essential component for handling scenarios with risk uncertainty.
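The quantile-matching ingredient can be illustrated in isolation with the standard pinball loss; the sketch below fits a toy quantile model and omits the flow-matching conditions that a full GFlowNet also requires (all names are illustrative).

```python
import numpy as np

# Rather than a single scalar flow value, learn its quantile function with the
# pinball (quantile regression) loss  L_tau(u) = u * (tau - 1[u < 0]).

def pinball_loss(target, prediction, tau):
    u = target - prediction
    return u * (tau - np.where(u < 0.0, 1.0, 0.0))

rng = np.random.default_rng(0)
theta = np.zeros(2)                       # predicted quantile = theta[0] + theta[1] * tau
for _ in range(5000):
    tau = rng.uniform()
    target = 3.0 + rng.standard_normal()  # samples whose quantiles we want to learn
    pred = theta[0] + theta[1] * tau
    grad_pred = -(tau - float(target - pred < 0.0))   # d(pinball)/d(prediction)
    theta -= 0.01 * grad_pred * np.array([1.0, tau])  # chain rule through the linear model
print(theta, theta[0] + 0.5 * theta[1])   # the value at tau = 0.5 approximates the median (≈ 3)
```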
arXiv Detail & Related papers (2023-02-11T22:06:17Z)
- Offline Model-Based Optimization via Normalized Maximum Likelihood Estimation [101.22379613810881]
We consider data-driven optimization problems where one must maximize a function given only queries at a fixed set of points.
This problem setting emerges in many domains where function evaluation is a complex and expensive process.
We propose a tractable approximation that allows us to scale our method to high-capacity neural network models.
arXiv Detail & Related papers (2021-02-16T06:04:27Z)
- Using Differentiable Programming for Flexible Statistical Modeling [0.0]
Differentiable programming has recently received much interest as a paradigm that facilitates taking gradients of computer programs.
We show how differentiable programming can enable simple gradient-based optimization of a model by automatic differentiation.
This allowed us to quickly prototype a model under time pressure that outperforms simpler benchmark models.
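A minimal illustration of that workflow, assuming a generic least-squares model rather than the paper's specific application:

```python
import jax
import jax.numpy as jnp

# Differentiable programming in miniature: write the model's loss as an ordinary
# program and let automatic differentiation supply the gradients for optimization.

def loss(params, t, y):
    a, b = params
    prediction = a * t + b             # any differentiable model could go here
    return 0.5 * jnp.sum((y - prediction) ** 2)

t = jnp.linspace(0.0, 1.0, 20)
y = 2.0 * t + 1.0 + 0.05 * jax.random.normal(jax.random.PRNGKey(0), (20,))

params = jnp.zeros(2)
grad_fn = jax.jit(jax.grad(loss))
for _ in range(2000):
    params = params - 0.02 * grad_fn(params, t, y)
print(params)                          # close to the generating values (a, b) = (2, 1)
```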
arXiv Detail & Related papers (2020-12-07T12:33:49Z)
- Self Normalizing Flows [65.73510214694987]
We propose a flexible framework for training normalizing flows by replacing expensive terms in the gradient by learned approximate inverses at each layer.
This reduces the computational complexity of each layer's exact update from $\mathcal{O}(D^3)$ to $\mathcal{O}(D^2)$.
We show experimentally that such models are remarkably stable and optimize to similar data likelihood values as their exact gradient counterparts.
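For a single linear layer, the expensive term in question is the matrix inverse appearing in the gradient of the log-determinant; the sketch below trains an approximate inverse with a plain reconstruction loss and reuses it in place of the exact inverse (details beyond the abstract are assumptions).

```python
import numpy as np

# For one linear flow layer z = W x, the exact gradient needs
#   d/dW log|det W| = (W^{-1})^T,            # requires a matrix inverse, O(D^3)
# A learned approximate inverse R can stand in for W^{-1}, so only O(D^2)
# matrix-vector work remains at each update.

rng = np.random.default_rng(0)
D = 5
W = np.eye(D) + 0.1 * rng.standard_normal((D, D))
R = np.eye(D)                                    # learned approximate inverse of W

for _ in range(2000):                            # train R so that R @ (W @ x) ≈ x
    x = rng.standard_normal(D)
    residual = R @ (W @ x) - x
    R -= 0.05 * np.outer(residual, W @ x)        # gradient of 0.5 * ||R W x - x||^2

exact_term = np.linalg.inv(W).T                  # what the exact gradient requires
approx_term = R.T                                # cheap surrogate from the learned inverse
print(np.max(np.abs(exact_term - approx_term)))  # small once R ≈ W^{-1}
```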
arXiv Detail & Related papers (2020-11-14T09:51:51Z)
- Dynamic Hierarchical Mimicking Towards Consistent Optimization Objectives [73.15276998621582]
We propose a generic feature learning mechanism to advance CNN training with enhanced generalization ability.
Partially inspired by deeply supervised nets (DSN), we fork delicately designed side branches from the intermediate layers of a given neural network.
Experiments on both category and instance recognition tasks demonstrate the substantial improvements of our proposed method.
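A skeletal illustration of forking side branches from intermediate layers, in the spirit of deeply supervised training; the paper's specific branch design and objectives are not reproduced here, and all names are illustrative.

```python
import numpy as np

# A toy backbone of three stages with auxiliary classification heads attached
# to the intermediate features; their losses are added to the main objective.

rng = np.random.default_rng(0)

def stage(x, w):                    # one backbone stage (toy dense layer + ReLU)
    return np.maximum(x @ w, 0.0)

def head(h, w):                     # a classification head on features h
    logits = h @ w
    e = np.exp(logits - logits.max())
    return e / e.sum()

def cross_entropy(probs, label):
    return -np.log(probs[label] + 1e-12)

x, label = rng.standard_normal(8), 1
w1, w2, w3 = (rng.standard_normal((8, 8)) * 0.3 for _ in range(3))
w_main, w_side1, w_side2 = (rng.standard_normal((8, 3)) * 0.3 for _ in range(3))

h1 = stage(x, w1)
h2 = stage(h1, w2)
h3 = stage(h2, w3)

loss = cross_entropy(head(h3, w_main), label)       # main objective
loss += cross_entropy(head(h1, w_side1), label)     # side branch after stage 1
loss += cross_entropy(head(h2, w_side2), label)     # side branch after stage 2
print(loss)
```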
arXiv Detail & Related papers (2020-03-24T09:56:13Z)
- Automatic structured variational inference [12.557212589634112]
We introduce automatic structured variational inference (ASVI).
ASVI is a fully automated method for constructing structured variational families.
We find that ASVI provides a clear improvement in performance when compared with other popular approaches.
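The convex-update construction at the heart of ASVI can be sketched as follows (a simplified illustration; parameter names are not the paper's).

```python
import numpy as np

# Each conditional of the variational program keeps the prior's functional form,
# but its parameters are a learnable convex combination of the prior-predicted
# parameters and free learnable parameters:
#   q-param = lambda * prior_param(parents) + (1 - lambda) * alpha,  lambda in [0, 1].
# lambda -> 1 recovers the prior conditional; lambda -> 0 gives a mean-field term.

def convex_update(prior_param, alpha, lambda_logit):
    lam = 1.0 / (1.0 + np.exp(-lambda_logit))
    return lam * prior_param + (1.0 - lam) * alpha

# Prior program: z ~ N(0, 1), x | z ~ N(z, 1).  Variational program below;
# the phi parameters would be trained by maximizing an ELBO.
rng = np.random.default_rng(0)
phi = {"z": {"alpha": 0.5, "lambda_logit": 0.0},
       "x": {"alpha": 1.2, "lambda_logit": 0.0}}

z = convex_update(0.0, **phi["z"]) + rng.standard_normal()   # prior mean of z is 0
x = convex_update(z, **phi["x"]) + rng.standard_normal()     # prior mean of x | z is z
print(z, x)
```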
arXiv Detail & Related papers (2020-02-03T10:52:30Z)
- Learning Likelihoods with Conditional Normalizing Flows [54.60456010771409]
Conditional normalizing flows (CNFs) are efficient in sampling and inference.
We present a study of CNFs where the base density to output space mapping is conditioned on an input x, to model conditional densities p(y|x).
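A minimal sketch of such a conditional mapping, using a single affine transform whose parameters are produced from x (all names are illustrative).

```python
import numpy as np

# A one-step conditional flow: the map from the base density to the output is
# conditioned on x, so the same machinery models p(y | x).

def conditioner(x, params):                 # tiny network mapping x to flow parameters
    w1, b1, w2, b2 = params
    out = np.tanh(x @ w1 + b1) @ w2 + b2
    return out[0], out[1]                   # mu(x), log_sigma(x)

def log_prob(y, x, params):
    # change of variables for y = mu(x) + sigma(x) * z with z ~ N(0, 1)
    mu, log_sigma = conditioner(x, params)
    z = (y - mu) * np.exp(-log_sigma)
    return -0.5 * z ** 2 - 0.5 * np.log(2.0 * np.pi) - log_sigma

def sample(x, params, rng):
    mu, log_sigma = conditioner(x, params)
    return mu + np.exp(log_sigma) * rng.standard_normal()

rng = np.random.default_rng(0)
params = (0.1 * rng.standard_normal((3, 4)), np.zeros(4),
          0.1 * rng.standard_normal((4, 2)), np.zeros(2))
x = np.array([1.0, -0.5, 2.0])
y = sample(x, params, rng)
print(y, log_prob(y, x, params))
```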
arXiv Detail & Related papers (2019-11-29T19:17:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.