Learning Invariant Weights in Neural Networks
- URL: http://arxiv.org/abs/2202.12439v1
- Date: Fri, 25 Feb 2022 00:17:09 GMT
- Title: Learning Invariant Weights in Neural Networks
- Authors: Tycho F.A. van der Ouderaa and Mark van der Wilk
- Abstract summary: Many commonly used models in machine learning are constrained to respect certain symmetries in the data.
We propose a weight-space equivalent to this approach, by minimizing a lower bound on the marginal likelihood to learn invariances in neural networks.
- Score: 16.127299898156203
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Assumptions about invariances or symmetries in data can significantly
increase the predictive power of statistical models. Many commonly used models
in machine learning are constrained to respect certain symmetries in the data,
such as translation equivariance in convolutional neural networks, and
incorporation of new symmetry types is actively being studied. Yet, efforts to
learn such invariances from the data itself remain an open research problem.
It has been shown that marginal likelihood offers a principled way to learn
invariances in Gaussian Processes. We propose a weight-space equivalent to this
approach, by minimizing a lower bound on the marginal likelihood to learn
invariances in neural networks, resulting in naturally higher-performing models.
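The abstract cites translation equivariance in convolutional neural networks as an example of a built-in symmetry. The following is a minimal sketch of that property only, not the method proposed in the paper: it assumes a 1-D circular convolution on plain Python lists, and the helper names `circular_conv` and `shift` are hypothetical, introduced here for illustration. It checks that shifting the input and then convolving gives the same result as convolving and then shifting, and that summing the output over all positions yields a shift-invariant value.

```python
def circular_conv(signal, kernel):
    """Circular 1-D cross-correlation: out[i] = sum_j kernel[j] * signal[(i + j) % n]."""
    n = len(signal)
    return [
        sum(k * signal[(i + j) % n] for j, k in enumerate(kernel))
        for i in range(n)
    ]


def shift(signal, steps):
    """Cyclically shift a signal to the right by `steps` positions."""
    n = len(signal)
    return [signal[(i - steps) % n] for i in range(n)]


# Illustrative values only; any signal and kernel would do.
signal = [1.0, 2.0, 0.0, -1.0, 3.0, 0.5]
kernel = [0.5, -1.0, 0.25]

# Equivariance: convolving a shifted input equals shifting the convolved output.
conv_then_shift = shift(circular_conv(signal, kernel), 2)
shift_then_conv = circular_conv(shift(signal, 2), kernel)
equivariant = all(
    abs(a - b) < 1e-12 for a, b in zip(conv_then_shift, shift_then_conv)
)

# Invariance: pooling (here, summing) over all positions removes the shift.
pooled_original = sum(circular_conv(signal, kernel))
pooled_shifted = sum(shift_then_conv)
invariant = abs(pooled_original - pooled_shifted) < 1e-12

print("equivariant:", equivariant)
print("invariant after pooling:", invariant)
```

In this sketch the symmetry is fixed by construction. The abstract's point is that the paper instead aims to learn which invariances to use from the data.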
Related papers
- Symmetry Discovery for Different Data Types [52.2614860099811]
Equivariant neural networks incorporate symmetries into their architecture, achieving higher generalization performance.
We propose LieSD, a method for discovering symmetries via trained neural networks which approximate the input-output mappings of the tasks.
We validate the performance of LieSD on tasks with symmetries such as the two-body problem, the moment of inertia matrix prediction, and top quark tagging.
arXiv Detail & Related papers (2024-10-13T13:39:39Z)
- A Probabilistic Approach to Learning the Degree of Equivariance in Steerable CNNs [5.141137421503899]
Steerable convolutional neural networks (SCNNs) enhance task performance by modelling geometric symmetries.
Yet, unknown or varying symmetries can lead to overconstrained weights and decreased performance.
This paper introduces a probabilistic method to learn the degree of equivariance in SCNNs.
arXiv Detail & Related papers (2024-06-06T10:45:19Z)
- MMGP: a Mesh Morphing Gaussian Process-based machine learning method for regression of physical problems under non-parameterized geometrical variability [0.30693357740321775]
We propose a machine learning method that does not rely on graph neural networks.
The proposed methodology can easily deal with large meshes without the need for explicit shape parameterization.
arXiv Detail & Related papers (2023-05-22T09:50:15Z)
- Learning Low Dimensional State Spaces with Overparameterized Recurrent Neural Nets [57.06026574261203]
We provide theoretical evidence for learning low-dimensional state spaces, which can also model long-term memory.
Experiments corroborate our theory, demonstrating extrapolation via learning low-dimensional state spaces with both linear and non-linear RNNs.
arXiv Detail & Related papers (2022-10-25T14:45:15Z)
- Equivariance Allows Handling Multiple Nuisance Variables When Analyzing Pooled Neuroimaging Datasets [53.34152466646884]
In this paper, we show how bringing recent results on equivariant representation learning instantiated on structured spaces together with simple use of classical results on causal inference provides an effective practical solution.
We demonstrate how our model allows dealing with more than one nuisance variable under some assumptions and can enable analysis of pooled scientific datasets in scenarios that would otherwise entail removing a large portion of the samples.
arXiv Detail & Related papers (2022-03-29T04:54:06Z)
- Equivariant vector field network for many-body system modeling [65.22203086172019]
Equivariant Vector Field Network (EVFN) is built on a novel equivariant basis and the associated scalarization and vectorization layers.
We evaluate our method on predicting trajectories of simulated Newton mechanics systems with both full and partially observed data.
arXiv Detail & Related papers (2021-10-26T14:26:25Z)
- Learning Invariances in Neural Networks [51.20867785006147]
We show how to parameterize a distribution over augmentations and optimize the training loss simultaneously with respect to the network parameters and augmentation parameters.
We can recover the correct set and extent of invariances on image classification, regression, segmentation, and molecular property prediction from a large space of augmentations.
arXiv Detail & Related papers (2020-10-22T17:18:48Z)
- Bayesian neural networks and dimensionality reduction [4.039245878626346]
A class of model-based approaches for such problems includes latent variables in an unknown non-linear regression function.
VAEs are artificial neural networks (ANNs) that employ approximations to make computation tractable.
We deploy Markov chain Monte Carlo sampling algorithms for Bayesian inference in ANN models with latent variables.
arXiv Detail & Related papers (2020-08-18T17:11:07Z)
- Multiplicative noise and heavy tails in stochastic optimization [62.993432503309485]
Empirical optimization is central to modern machine learning, but its role in its success is still unclear.
We show that it commonly arises in parameters of discrete multiplicative noise due to variance.
A detailed analysis is conducted in which we describe key factors, including recent step size and data, which all exhibit similar results on state-of-the-art neural network models.
arXiv Detail & Related papers (2020-06-11T09:58:01Z)
- 'Place-cell' emergence and learning of invariant data with restricted Boltzmann machines: breaking and dynamical restoration of continuous symmetries in the weight space [0.0]
We study the learning dynamics of Restricted Boltzmann Machines (RBM), a neural network paradigm for representation learning.
As learning proceeds from a random configuration of the network weights, we show the existence of a symmetry-breaking phenomenon.
This symmetry-breaking phenomenon takes place only if the amount of data available for training exceeds some critical value.
arXiv Detail & Related papers (2019-12-30T14:37:14Z)