Related papers: Learning Invariant Weights in Neural Networks

URL: http://arxiv.org/abs/2202.12439v1
Date: Fri, 25 Feb 2022 00:17:09 GMT
Title: Learning Invariant Weights in Neural Networks
Authors: Tycho F.A. van der Ouderaa and Mark van der Wilk
Abstract summary: Many commonly used models in machine learning are constrained to respect certain symmetries in the data. We propose a weight-space equivalent to this approach, by minimizing a lower bound on the marginal likelihood to learn invariances in neural networks.
Score: 16.127299898156203
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Assumptions about invariances or symmetries in data can significantly increase the predictive power of statistical models. Many commonly used models in machine learning are constrained to respect certain symmetries in the data, such as translation equivariance in convolutional neural networks, and incorporation of new symmetry types is actively being studied. Yet, learning such invariances from the data itself remains an open research problem. It has been shown that marginal likelihood offers a principled way to learn invariances in Gaussian Processes. We propose a weight-space equivalent to this approach, by minimizing a lower bound on the marginal likelihood to learn invariances in neural networks, resulting in naturally higher performing models.

Related papers

Are Statistical Methods Obsolete in the Era of Deep Learning? [0.8329456268842228]
In the era of AI, neural networks have become increasingly popular for modeling, inference, and prediction. With the proliferation of such deep learning models, a question arises: are leaner statistical methods still relevant? We show that statistical methods are far from obsolete, especially when working with sparse and noisy observations.
arXiv Detail & Related papers (2025-05-27T20:11:21Z)
Diagonal Symmetrization of Neural Network Solvers for the Many-Electron Schrödinger Equation [11.202098800341096]
We study different ways of incorporating diagonal invariance in neural network ansätze trained via variational Monte Carlo methods. We show that, contrary to standard ML setups, in-training symmetrization destabilizes training and can lead to worse performance. Our theoretical and numerical results indicate that this unexpected behavior may arise from a unique computational-statistical tradeoff not found in standard ML analyses of symmetrization.
arXiv Detail & Related papers (2025-02-07T20:37:25Z)
Symmetry Discovery for Different Data Types [52.2614860099811]
Equivariant neural networks incorporate symmetries into their architecture, achieving higher generalization performance. We propose LieSD, a method for discovering symmetries via trained neural networks which approximate the input-output mappings of the tasks. We validate the performance of LieSD on tasks with symmetries such as the two-body problem, the moment of inertia matrix prediction, and top quark tagging.
arXiv Detail & Related papers (2024-10-13T13:39:39Z)
A Probabilistic Approach to Learning the Degree of Equivariance in Steerable CNNs [5.141137421503899]
Steerable convolutional neural networks (SCNNs) enhance task performance by modelling geometric symmetries. Yet, unknown or varying symmetries can lead to overconstrained weights and decreased performance. This paper introduces a probabilistic method to learn the degree of equivariance in SCNNs.
arXiv Detail & Related papers (2024-06-06T10:45:19Z)
Scaling and renormalization in high-dimensional regression [72.59731158970894]
We present a unifying perspective on recent results on ridge regression. We use the basic tools of random matrix theory and free probability, aimed at readers with backgrounds in physics and deep learning. Our results extend and provide a unifying perspective on earlier models of scaling laws.
arXiv Detail & Related papers (2024-05-01T15:59:00Z)
MMGP: a Mesh Morphing Gaussian Process-based machine learning method for regression of physical problems under non-parameterized geometrical variability [0.30693357740321775]
We propose a machine learning method that does not rely on graph neural networks. The proposed methodology can easily deal with large meshes without the need for explicit shape parameterization.
arXiv Detail & Related papers (2023-05-22T09:50:15Z)
Learning Low Dimensional State Spaces with Overparameterized Recurrent Neural Nets [57.06026574261203]
We provide theoretical evidence for learning low-dimensional state spaces, which can also model long-term memory. Experiments corroborate our theory, demonstrating extrapolation via learning low-dimensional state spaces with both linear and non-linear RNNs.
arXiv Detail & Related papers (2022-10-25T14:45:15Z)
Equivariance Allows Handling Multiple Nuisance Variables When Analyzing Pooled Neuroimaging Datasets [53.34152466646884]
In this paper, we show how bringing recent results on equivariant representation learning instantiated on structured spaces together with simple use of classical results on causal inference provides an effective practical solution. We demonstrate how our model allows dealing with more than one nuisance variable under some assumptions and can enable analysis of pooled scientific datasets in scenarios that would otherwise entail removing a large portion of the samples.
arXiv Detail & Related papers (2022-03-29T04:54:06Z)
Equivariant vector field network for many-body system modeling [65.22203086172019]
Equivariant Vector Field Network (EVFN) is built on a novel equivariant basis and the associated scalarization and vectorization layers. We evaluate our method on predicting trajectories of simulated Newton mechanics systems with both full and partially observed data.
arXiv Detail & Related papers (2021-10-26T14:26:25Z)
Learning Invariances in Neural Networks [51.20867785006147]
We show how to parameterize a distribution over augmentations and optimize the training loss simultaneously with respect to the network parameters and augmentation parameters. We can recover the correct set and extent of invariances on image classification, regression, segmentation, and molecular property prediction from a large space of augmentations.
arXiv Detail & Related papers (2020-10-22T17:18:48Z)
Bayesian neural networks and dimensionality reduction [4.039245878626346]
A class of model-based approaches for such problems includes latent variables in an unknown non-linear regression function. VAEs are artificial neural networks (ANNs) that employ approximations to make computation tractable. We deploy Markov chain Monte Carlo sampling algorithms for Bayesian inference in ANN models with latent variables.
arXiv Detail & Related papers (2020-08-18T17:11:07Z)
Multiplicative noise and heavy tails in stochastic optimization [62.993432503309485]
Empirical optimization is central to modern machine learning, but its role in its success is still unclear. We show that heavy-tailed behavior commonly arises in parameters of discrete multiplicative noise due to variance. A detailed analysis is conducted in which we describe key factors, including step size and data, all of which exhibit similar results on state-of-the-art neural network models.
arXiv Detail & Related papers (2020-06-11T09:58:01Z)
'Place-cell' emergence and learning of invariant data with restricted Boltzmann machines: breaking and dynamical restoration of continuous symmetries in the weight space [0.0]
We study the learning dynamics of Restricted Boltzmann Machines (RBM), a neural network paradigm for representation learning. As learning proceeds from a random configuration of the network weights, we show the existence of a symmetry-breaking phenomenon. This symmetry-breaking phenomenon takes place only if the amount of data available for training exceeds some critical value.
arXiv Detail & Related papers (2019-12-30T14:37:14Z)

This list is automatically generated from the titles and abstracts of the papers on this site.