A Bayesian Approach to Invariant Deep Neural Networks
- URL: http://arxiv.org/abs/2107.09301v1
- Date: Tue, 20 Jul 2021 07:33:58 GMT
- Title: A Bayesian Approach to Invariant Deep Neural Networks
- Authors: Nikolaos Mourdoukoutas, Marco Federici, Georges Pantalos, Mark van der
Wilk and Vincent Fortuin
- Abstract summary: We show that our model outperforms other non-invariant architectures when trained on datasets that contain specific invariances.
The same holds true when no data augmentation is performed.
- Score: 14.807284992678762
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a novel Bayesian neural network architecture that can learn
invariances from data alone by inferring a posterior distribution over
different weight-sharing schemes. We show that our model outperforms other
non-invariant architectures when trained on datasets that contain specific
invariances. The same holds true when no data augmentation is performed.
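To make the mechanism concrete, the following is a minimal, hypothetical sketch (not the authors' code): a linear layer whose effective weights interpolate between an unconstrained weight matrix and its C4-symmetrised, weight-shared version, with a single learnable mixing probability standing in for a crude posterior over the two weight-sharing schemes.

```python
# A toy "soft weight-sharing" layer. All names are hypothetical; the code only
# illustrates the idea of weighting different weight-sharing schemes.
import torch
import torch.nn as nn


class SoftSharedLinear(nn.Module):
    def __init__(self, k: int, out_features: int):
        super().__init__()
        self.k = k
        self.weight = nn.Parameter(0.01 * torch.randn(out_features, k * k))
        self.bias = nn.Parameter(torch.zeros(out_features))
        # Logit of the probability that the C4 weight-sharing scheme is active.
        self.share_logit = nn.Parameter(torch.zeros(()))

    def effective_weight(self) -> torch.Tensor:
        w = self.weight.view(-1, self.k, self.k)
        # Averaging over the four 90-degree rotations enforces exact C4 sharing.
        w_shared = torch.stack(
            [torch.rot90(w, r, dims=(1, 2)) for r in range(4)]
        ).mean(0)
        p = torch.sigmoid(self.share_logit)  # mixing weight of the shared scheme
        return (p * w_shared + (1 - p) * w).view_as(self.weight)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x holds flattened k-by-k inputs, shape (batch, k * k).
        return x @ self.effective_weight().t() + self.bias


layer = SoftSharedLinear(k=8, out_features=16)
x = torch.randn(32, 64)
print(layer(x).shape)  # torch.Size([32, 16])
```

With the mixing probability near 1 the layer is exactly invariant to 90-degree rotations of its flattened 8x8 input; near 0 it reduces to an ordinary linear layer. The paper's contribution, per the abstract, is to infer a posterior distribution over such sharing schemes rather than a point estimate.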
Related papers
- Improved Generalization of Weight Space Networks via Augmentations [53.87011906358727]
Learning in deep weight spaces (DWS) is an emerging research direction, with applications to 2D and 3D neural fields (INRs, NeRFs).
We empirically analyze the reasons for this overfitting and find that a key reason is the lack of diversity in DWS datasets.
To address this, we explore strategies for data augmentation in weight spaces and propose a MixUp method adapted for weight spaces.
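One plausible minimal form of MixUp adapted to weight spaces, assumed here purely for illustration (it need not match the paper's exact scheme), linearly interpolates the flattened weights of two networks together with their labels.

```python
# Hypothetical sketch: MixUp applied directly to flattened network weights.
import torch


def weight_space_mixup(w1, w2, y1, y2, alpha: float = 0.2):
    """Interpolate two weight vectors and their labels, MixUp-style."""
    lam = torch.distributions.Beta(alpha, alpha).sample()
    return lam * w1 + (1 - lam) * w2, lam * y1 + (1 - lam) * y2


w1, w2 = torch.randn(10_000), torch.randn(10_000)  # flattened weights of two INRs
y1, y2 = torch.tensor([1.0, 0.0]), torch.tensor([0.0, 1.0])  # one-hot labels
w_mix, y_mix = weight_space_mixup(w1, w2, y1, y2)
```

In practice the two weight vectors may need to be aligned (e.g. up to hidden-neuron permutations) before interpolation for the mixed weights to remain meaningful.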
arXiv Detail & Related papers (2024-02-06T15:34:44Z)
- Unveiling Invariances via Neural Network Pruning [44.47186380630998]
Invariance describes transformations that do not alter data's underlying semantics.
Modern networks are handcrafted to handle well-known invariances.
We propose a framework to learn novel network architectures that capture data-dependent invariances via pruning.
arXiv Detail & Related papers (2023-09-15T05:38:33Z)
- Set-based Neural Network Encoding Without Weight Tying [91.37161634310819]
We propose a neural network weight encoding method for network property prediction.
Our approach is capable of encoding neural networks in a model zoo of mixed architectures.
We introduce two new tasks for neural network property prediction: cross-dataset and cross-architecture.
arXiv Detail & Related papers (2023-05-26T04:34:28Z)
- Federated Variational Inference Methods for Structured Latent Variable Models [1.0312968200748118]
Federated learning methods enable model training across distributed data sources without data leaving their original locations.
We present a general and elegant solution based on structured variational inference, widely used in Bayesian machine learning.
We also provide a communication-efficient variant analogous to the canonical FedAvg algorithm.
arXiv Detail & Related papers (2023-02-07T08:35:04Z)
- The Lie Derivative for Measuring Learned Equivariance [84.29366874540217]
We study the equivariance properties of hundreds of pretrained models, spanning CNNs, transformers, and Mixer architectures.
We find that many violations of equivariance can be linked to spatial aliasing in ubiquitous network layers, such as pointwise non-linearities.
For example, transformers can be more equivariant than convolutional neural networks after training.
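As a toy version of this kind of measurement (an assumed finite-difference sketch, not the paper's estimator), one can numerically differentiate a model's output along a one-parameter family of input transformations; for an invariant model the derivative vanishes, and the Lie derivative generalises this to equivariance by accounting for how the output itself should transform.

```python
# Hypothetical sketch: finite-difference estimate of the derivative of f along
# planar rotations of its input, evaluated at the identity transformation.
import torch
import torch.nn as nn


def rotate(x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
    """Rotate 2D points x of shape (N, 2) by angle t."""
    c, s = torch.cos(t), torch.sin(t)
    rot = torch.stack([torch.stack([c, -s]), torch.stack([s, c])])
    return x @ rot.t()


def rotation_lie_derivative(f, x, eps: float = 1e-3) -> torch.Tensor:
    """Central finite difference of f(rotate(x, t)) at t = 0."""
    t = torch.tensor(eps)
    return (f(rotate(x, t)) - f(rotate(x, -t))) / (2 * eps)


f = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, 3))
x = torch.randn(128, 2)
# Near-zero values indicate (local) rotation invariance of f at these inputs.
print(rotation_lie_derivative(f, x).norm(dim=-1).mean())
```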
arXiv Detail & Related papers (2022-10-06T15:20:55Z)
- Variational Neural Networks [88.24021148516319]
We propose a method for uncertainty estimation in neural networks called the Variational Neural Network (VNN).
VNN generates parameters for the output distribution of a layer by transforming its inputs with learnable sub-layers.
In uncertainty quality estimation experiments, we show that VNNs achieve better uncertainty quality than Monte Carlo Dropout or Bayes By Backpropagation methods.
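A minimal sketch of that idea, assuming a Gaussian output distribution per layer (illustrative names, not the authors' code):

```python
# A minimal sketch: two learnable sub-layers map the input to the mean and
# log-variance of a Gaussian over the layer's output, from which activations
# are sampled.
import torch
import torch.nn as nn


class VariationalLayer(nn.Module):
    def __init__(self, d_in: int, d_out: int):
        super().__init__()
        self.mean_sublayer = nn.Linear(d_in, d_out)
        self.logvar_sublayer = nn.Linear(d_in, d_out)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        mu = self.mean_sublayer(x)
        std = torch.exp(0.5 * self.logvar_sublayer(x))
        return mu + std * torch.randn_like(mu)  # reparameterised sample


layer = VariationalLayer(16, 8)
x = torch.randn(1, 16)
samples = torch.stack([layer(x) for _ in range(100)])
print(samples.std(0))  # spread across repeated evaluations on the same input
```

Repeated evaluations on the same input yield different samples, and their spread is what the uncertainty estimates are built from.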
arXiv Detail & Related papers (2022-07-04T15:41:02Z)
- Equivariance versus Augmentation for Spherical Images [0.7388859384645262]
We analyze the role of rotational equivariance in convolutional neural networks (CNNs) applied to spherical images.
We compare the performance of the group equivariant networks known as S2CNNs and standard non-equivariant CNNs trained with an increasing amount of data augmentation.
arXiv Detail & Related papers (2022-02-08T16:49:30Z)
- Kalman Bayesian Neural Networks for Closed-form Online Learning [5.220940151628734]
We propose a novel approach for BNN learning via closed-form Bayesian inference.
The calculation of the predictive distribution of the output and the update of the weight distribution are treated as Bayesian filtering and smoothing problems.
This allows closed-form expressions for training the network's parameters in a sequential/online fashion without gradient descent.
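The closed-form building block behind such filtering-style updates is the standard Gaussian (Kalman) measurement update; a minimal sketch for a single linear observation of the weights is shown below (illustrative only, the paper handles full nonlinear networks via Bayesian filtering and smoothing):

```python
# Hypothetical sketch: closed-form Gaussian update for weights w ~ N(mu, Sigma)
# observed through a linear measurement y = x^T w + noise with variance r.
# This is the standard Kalman measurement update; no gradient descent is used.
import numpy as np


def kalman_weight_update(mu, Sigma, x, y, r=0.1):
    """Return the posterior mean and covariance after observing (x, y)."""
    pred_mean = x @ mu                   # predictive mean of y
    pred_var = x @ Sigma @ x + r         # predictive variance of y
    gain = Sigma @ x / pred_var          # Kalman gain
    mu_new = mu + gain * (y - pred_mean)
    Sigma_new = Sigma - np.outer(gain, x) @ Sigma
    return mu_new, Sigma_new


mu, Sigma = np.zeros(3), np.eye(3)
x, y = np.array([1.0, 0.5, -0.2]), 0.7
mu, Sigma = kalman_weight_update(mu, Sigma, x, y)
print(mu, np.diag(Sigma))
```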
arXiv Detail & Related papers (2021-10-03T07:29:57Z)
- Learning Invariances in Neural Networks [51.20867785006147]
We show how to parameterize a distribution over augmentations and optimize the training loss simultaneously with respect to the network parameters and augmentation parameters.
We can recover the correct set and extent of invariances on image classification, regression, segmentation, and molecular property prediction from a large space of augmentations.
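A rough sketch of the joint optimisation, under the assumption that the augmentation family is planar rotation with a learnable maximum angle (the names below are illustrative, not the paper's code):

```python
# A classifier whose predictions are averaged over rotations sampled from a
# learnable range, so the training loss is differentiable in both the network
# weights and the augmentation extent.
import torch
import torch.nn as nn


class AugmentedClassifier(nn.Module):
    def __init__(self, n_classes: int = 2, n_samples: int = 8):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(2, 32), nn.ReLU(), nn.Linear(32, n_classes))
        self.max_angle = nn.Parameter(torch.tensor(0.1))  # learnable augmentation extent
        self.n_samples = n_samples

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        logits = 0.0
        for _ in range(self.n_samples):
            # Reparameterised draw: the angle depends differentiably on max_angle.
            angle = self.max_angle * (2 * torch.rand(()) - 1)
            c, s = torch.cos(angle), torch.sin(angle)
            rot = torch.stack([torch.stack([c, -s]), torch.stack([s, c])])
            logits = logits + self.net(x @ rot.t())
        return logits / self.n_samples


model = AugmentedClassifier()
x, y = torch.randn(64, 2), torch.randint(0, 2, (64,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()  # gradients reach both the weights and max_angle
```

Because predictions are averaged over reparameterised samples, a single training loss can drive both the network parameters and the extent of the learned augmentation.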
arXiv Detail & Related papers (2020-10-22T17:18:48Z)
- Diversity inducing Information Bottleneck in Model Ensembles [73.80615604822435]
In this paper, we target the problem of generating effective ensembles of neural networks by encouraging diversity in prediction.
We explicitly optimize a diversity inducing adversarial loss for learning latent variables and thereby obtain diversity in the output predictions necessary for modeling multi-modal data.
Compared to the most competitive baselines, we show significant improvements in classification accuracy under a shift in the data distribution.
arXiv Detail & Related papers (2020-03-10T03:10:41Z)