Embedded-model flows: Combining the inductive biases of model-free deep
learning and explicit probabilistic modeling
- URL: http://arxiv.org/abs/2110.06021v1
- Date: Tue, 12 Oct 2021 14:12:16 GMT
- Title: Embedded-model flows: Combining the inductive biases of model-free deep
learning and explicit probabilistic modeling
- Authors: Gianluigi Silvestri, Emily Fertig, Dave Moore, Luca Ambrogioni
- Abstract summary: We propose embedded-model flows which alternate general-purpose transformations with structured layers that embed domain-specific inductive biases.
We demonstrate that EMFs can be used to induce desirable properties such as multimodality, hierarchical coupling and continuity.
In experiments, we show that this approach outperforms state-of-the-art methods in common structured inference problems.
- Score: 8.405013085269976
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Normalizing flows have shown great success as general-purpose density
estimators. However, many real world applications require the use of
domain-specific knowledge, which normalizing flows cannot readily incorporate.
We propose embedded-model flows(EMF), which alternate general-purpose
transformations with structured layers that embed domain-specific inductive
biases. These layers are automatically constructed by converting user-specified
differentiable probabilistic models into equivalent bijective transformations.
We also introduce gated structured layers, which allow bypassing the parts of
the models that fail to capture the statistics of the data. We demonstrate that
EMFs can be used to induce desirable properties such as multimodality,
hierarchical coupling and continuity. Furthermore, we show that EMFs enable a
high performance form of variational inference where the structure of the prior
model is embedded in the variational architecture. In our experiments, we show
that this approach outperforms state-of-the-art methods in common structured
inference problems.
Related papers
- Learning Divergence Fields for Shift-Robust Graph Representations [73.11818515795761]
In this work, we propose a geometric diffusion model with learnable divergence fields for the challenging problem with interdependent data.
We derive a new learning objective through causal inference, which can guide the model to learn generalizable patterns of interdependence that are insensitive across domains.
arXiv Detail & Related papers (2024-06-07T14:29:21Z) - Explaining Modern Gated-Linear RNNs via a Unified Implicit Attention Formulation [54.50526986788175]
Recent advances in efficient sequence modeling have led to attention-free layers, such as Mamba, RWKV, and various gated RNNs.
We present a unified view of these models, formulating such layers as implicit causal self-attention layers.
Our framework compares the underlying mechanisms on similar grounds for different layers and provides a direct means for applying explainability methods.
arXiv Detail & Related papers (2024-05-26T09:57:45Z) - Attention Mechanisms Don't Learn Additive Models: Rethinking Feature Importance for Transformers [12.986126243018452]
We introduce the Softmax-Linked Additive Log-Odds Model (SLALOM), a novel surrogate model specifically designed to align with the transformer framework.
SLALOM demonstrates the capacity to deliver a range of faithful and insightful explanations across both synthetic and real-world datasets.
arXiv Detail & Related papers (2024-05-22T11:14:00Z) - Uncertainty quantification and out-of-distribution detection using
surjective normalizing flows [46.51077762143714]
We propose a simple approach using surjective normalizing flows to identify out-of-distribution data sets in deep neural network models.
We show that our method can reliably discern out-of-distribution data from in-distribution data.
arXiv Detail & Related papers (2023-11-01T09:08:35Z) - DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained
Diffusion [66.21290235237808]
We introduce an energy constrained diffusion model which encodes a batch of instances from a dataset into evolutionary states.
We provide rigorous theory that implies closed-form optimal estimates for the pairwise diffusion strength among arbitrary instance pairs.
Experiments highlight the wide applicability of our model as a general-purpose encoder backbone with superior performance in various tasks.
arXiv Detail & Related papers (2023-01-23T15:18:54Z) - Moser Flow: Divergence-based Generative Modeling on Manifolds [49.04974733536027]
Moser Flow (MF) is a new class of generative models within the family of continuous normalizing flows (CNF)
MF does not require invoking or backpropagating through an ODE solver during training.
We demonstrate for the first time the use of flow models for sampling from general curved surfaces.
arXiv Detail & Related papers (2021-08-18T09:00:24Z) - Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the ( aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z) - Deep Conditional Transformation Models [0.0]
Learning the cumulative distribution function (CDF) of an outcome variable conditional on a set of features remains challenging.
Conditional transformation models provide a semi-parametric approach that allows to model a large class of conditional CDFs.
We propose a novel network architecture, provide details on different model definitions and derive suitable constraints.
arXiv Detail & Related papers (2020-10-15T16:25:45Z) - Variational Mixture of Normalizing Flows [0.0]
Deep generative models, such as generative adversarial networks autociteGAN, variational autoencoders autocitevaepaper, and their variants, have seen wide adoption for the task of modelling complex data distributions.
Normalizing flows have overcome this limitation by leveraging the change-of-suchs formula for probability density functions.
The present work overcomes this by using normalizing flows as components in a mixture model and devising an end-to-end training procedure for such a model.
arXiv Detail & Related papers (2020-09-01T17:20:08Z) - Learning Likelihoods with Conditional Normalizing Flows [54.60456010771409]
Conditional normalizing flows (CNFs) are efficient in sampling and inference.
We present a study of CNFs where the base density to output space mapping is conditioned on an input x, to model conditional densities p(y|x)
arXiv Detail & Related papers (2019-11-29T19:17:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.