Variational Inference Failures Under Model Symmetries: Permutation Invariant Posteriors for Bayesian Neural Networks
- URL: http://arxiv.org/abs/2408.05496v1
- Date: Sat, 10 Aug 2024 09:06:34 GMT
- Title: Variational Inference Failures Under Model Symmetries: Permutation Invariant Posteriors for Bayesian Neural Networks
- Authors: Yoav Gelberg, Tycho F. A. van der Ouderaa, Mark van der Wilk, Yarin Gal
- Abstract summary: We investigate the impact of weight space permutation symmetries on variational inference.
We devise a symmetrization mechanism for constructing permutation invariant variational posteriors.
We show that the symmetrized distribution has a strictly better fit to the true posterior, and that it can be trained using the original ELBO objective.
- Score: 43.88179780450706
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Weight space symmetries in neural network architectures, such as permutation symmetries in MLPs, give rise to Bayesian neural network (BNN) posteriors with many equivalent modes. This multimodality poses a challenge for variational inference (VI) techniques, which typically rely on approximating the posterior with a unimodal distribution. In this work, we investigate the impact of weight space permutation symmetries on VI. We demonstrate, both theoretically and empirically, that these symmetries lead to biases in the approximate posterior, which degrade predictive performance and posterior fit if not explicitly accounted for. To mitigate this behavior, we leverage the symmetric structure of the posterior and devise a symmetrization mechanism for constructing permutation invariant variational posteriors. We show that the symmetrized distribution has a strictly better fit to the true posterior, and that it can be trained using the original ELBO objective with a modified KL regularization term. We demonstrate experimentally that our approach mitigates the aforementioned biases and results in improved predictions and a higher ELBO.
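The permutation symmetry the abstract refers to can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes a one-hidden-layer tanh MLP, and the helper names `mlp` and `permute_hidden` are hypothetical. Reordering the hidden units gives a different weight vector that computes the same function, which is why the posterior has many equivalent modes.

```python
import numpy as np

def mlp(x, W1, b1, W2, b2):
    """One-hidden-layer MLP with tanh activation (illustrative only)."""
    return np.tanh(x @ W1 + b1) @ W2 + b2

def permute_hidden(W1, b1, W2, perm):
    """Reorder hidden units: columns of W1, entries of b1, rows of W2."""
    return W1[:, perm], b1[perm], W2[perm, :]

rng = np.random.default_rng(0)
d_in, d_hidden, d_out = 3, 5, 2
W1 = rng.normal(size=(d_in, d_hidden))
b1 = rng.normal(size=d_hidden)
W2 = rng.normal(size=(d_hidden, d_out))
b2 = rng.normal(size=d_out)
x = rng.normal(size=(4, d_in))

# A fixed non-identity permutation of the hidden units (cyclic shift).
perm = np.roll(np.arange(d_hidden), 1)
W1_p, b1_p, W2_p = permute_hidden(W1, b1, W2, perm)

y = mlp(x, W1, b1, W2, b2)
y_p = mlp(x, W1_p, b1_p, W2_p, b2)

# The weights differ, but the network outputs agree up to rounding.
weights_differ = not np.allclose(W1, W1_p)
same_function = np.allclose(y, y_p)
print(weights_differ, same_function)
```

With 5 hidden units there are up to 5! = 120 such equivalent weight settings, while a unimodal variational posterior centred on one of them covers only a single mode.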
Related papers
- Approximate Equivariance in Reinforcement Learning [35.04248486334824]
Equivariant neural networks have shown great success in reinforcement learning.
In many problems, only approximate symmetry is present, which makes imposing exact symmetry inappropriate.
We develop approximately equivariant algorithms in reinforcement learning.
arXiv Detail & Related papers (2024-11-06T19:44:46Z)
- Relative Representations: Topological and Geometric Perspectives [53.88896255693922]
Relative representations are an established approach to zero-shot model stitching.
First, we introduce a normalization procedure in the relative transformation, resulting in invariance to non-isotropic rescalings and permutations.
Second, we propose to deploy topological densification when fine-tuning relative representations, a topological regularization loss encouraging clustering within classes.
arXiv Detail & Related papers (2024-09-17T08:09:22Z)
- Reparameterization invariance in approximate Bayesian inference [32.88960624085645]
We develop a new geometric view of reparametrizations from which we explain the success of linearization.
We demonstrate that these reparameterization invariance properties can be extended to the original neural network predictive.
arXiv Detail & Related papers (2024-06-05T14:49:15Z)
- The Empirical Impact of Neural Parameter Symmetries, or Lack Thereof [50.49582712378289]
We investigate the impact of neural parameter symmetries by introducing new neural network architectures.
We develop two methods, with some provable guarantees, of modifying standard neural networks to reduce parameter space symmetries.
Our experiments reveal several interesting observations on the empirical impact of parameter symmetries.
arXiv Detail & Related papers (2024-05-30T16:32:31Z)
- Enhancing lattice kinetic schemes for fluid dynamics with Lattice-Equivariant Neural Networks [79.16635054977068]
We present a new class of equivariant neural networks, dubbed Lattice-Equivariant Neural Networks (LENNs).
Our approach develops within a recently introduced framework aimed at learning neural network-based surrogate models for Lattice Boltzmann collision operators.
Our work opens towards practical utilization of machine learning-augmented Lattice Boltzmann CFD in real-world simulations.
arXiv Detail & Related papers (2024-05-22T17:23:15Z)
- Symmetry Breaking and Equivariant Neural Networks [17.740760773905986]
We introduce a novel notion of 'relaxed equivariance'.
We show how to incorporate this relaxation into equivariant multilayer perceptrons (E-MLPs).
The relevance of symmetry breaking is then discussed in various application domains.
arXiv Detail & Related papers (2023-12-14T15:06:48Z)
- Learning Probabilistic Symmetrization for Architecture Agnostic Equivariance [16.49488981364657]
We present a novel framework to overcome the limitations of equivariant architectures in learning functions with group symmetries.
We use an arbitrary base model such as an MLP or a transformer and symmetrize it to be equivariant to the given group.
Empirical tests show competitive results against tailored equivariant architectures.
arXiv Detail & Related papers (2023-06-05T13:40:54Z)
- Oracle-Preserving Latent Flows [58.720142291102135]
We develop a methodology for the simultaneous discovery of multiple nontrivial continuous symmetries across an entire labelled dataset.
The symmetry transformations and the corresponding generators are modeled with fully connected neural networks trained with a specially constructed loss function.
The two new elements in this work are the use of a reduced-dimensionality latent space and the generalization to transformations invariant with respect to high-dimensional oracles.
arXiv Detail & Related papers (2023-02-02T00:13:32Z)
- Equivariant neural networks for inverse problems [1.7942265700058986]
We show that group equivariant convolutional operations can naturally be incorporated into learned reconstruction methods.
We design learned iterative methods in which the proximal operators are modelled as group equivariant convolutional neural networks.
arXiv Detail & Related papers (2021-02-23T05:38:41Z)
- Learning Invariances in Neural Networks [51.20867785006147]
We show how to parameterize a distribution over augmentations and optimize the training loss simultaneously with respect to the network parameters and augmentation parameters.
We can recover the correct set and extent of invariances on image classification, regression, segmentation, and molecular property prediction from a large space of augmentations.
arXiv Detail & Related papers (2020-10-22T17:18:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.