Learning Layer-wise Equivariances Automatically using Gradients
- URL: http://arxiv.org/abs/2310.06131v1
- Date: Mon, 9 Oct 2023 20:22:43 GMT
- Title: Learning Layer-wise Equivariances Automatically using Gradients
- Authors: Tycho F.A. van der Ouderaa, Alexander Immer, Mark van der Wilk
- Abstract summary: Convolutions encode equivariance symmetries into neural networks, leading to better generalisation performance.
However, symmetries provide fixed hard constraints on the functions a network can represent, must be specified in advance, and cannot be adapted.
Our goal is to allow flexible symmetry constraints that can be learned automatically from data using gradients.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Convolutions encode equivariance symmetries into neural networks leading to
better generalisation performance. However, symmetries provide fixed hard
constraints on the functions a network can represent, need to be specified in
advance, and cannot be adapted. Our goal is to allow flexible symmetry
constraints that can automatically be learned from data using gradients.
Learning symmetry and associated weight connectivity structures from scratch is
difficult for two reasons. First, it requires efficient and flexible
parameterisations of layer-wise equivariances. Second, symmetries act as
constraints and are therefore not encouraged by training losses measuring data
fit. To overcome these challenges, we improve parameterisations of soft
equivariance and learn the amount of equivariance in layers by optimising the
marginal likelihood, estimated using differentiable Laplace approximations. The
objective balances data fit and model complexity, enabling layer-wise symmetry
discovery in deep networks. We demonstrate the ability to automatically learn
layer-wise equivariances on image classification tasks, achieving equivalent or
improved performance over baselines with hard-coded symmetry.
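The abstract names two ingredients: a relaxed ("soft") equivariance parameterisation whose deviation from exact weight sharing is governed by a prior, and a differentiable Laplace approximation of the marginal likelihood that trades data fit against model complexity. The PyTorch sketch below illustrates the idea under simplifying assumptions; `SoftConv2d`, the additive-deviation parameterisation, and the squared-gradient curvature proxy are illustrative choices, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftConv2d(nn.Module):
    """Convolution relaxed by a per-location additive deviation.

    deviation == 0 recovers an exactly translation-equivariant convolution;
    the learned prior precision on the deviation controls how strongly the
    layer is pulled back towards that symmetry.
    """
    def __init__(self, in_ch, out_ch, k, h, w):
        super().__init__()
        self.shared = nn.Conv2d(in_ch, out_ch, k, padding=k // 2)
        self.deviation = nn.Parameter(torch.zeros(out_ch, h, w))
        # Hyperparameter: log precision of a Gaussian prior on the deviation.
        self.log_prior_prec = nn.Parameter(torch.zeros(()))

    def forward(self, x):
        return self.shared(x) + self.deviation

def neg_log_marginal_likelihood(nll, layer):
    """Diagonal-Laplace proxy for -log p(D | prior precision):

        -log ML ~= nll + 0.5*prec*||dev||^2 - 0.5*n*log(prec)
                   + 0.5 * sum_i log(h_i + prec)

    with h_i a diagonal curvature estimate of the data fit; squared
    gradients stand in for the GGN diagonal here (crude but cheap). The
    log-determinant term penalises unnecessary flexibility, so high
    precision (near-exact equivariance) wins unless the data demand
    deviations -- the Occam trade-off the abstract describes.
    """
    dev, prec = layer.deviation, layer.log_prior_prec.exp()
    (g,) = torch.autograd.grad(nll, dev, retain_graph=True)
    h = g.detach().pow(2)  # curvature treated as a constant here
    occam = 0.5 * (h + prec).log().sum()
    prior = 0.5 * prec * (dev ** 2).sum() - 0.5 * dev.numel() * layer.log_prior_prec
    return nll + prior + occam

# Usage: one step optimising weights and prior precision together.
layer = SoftConv2d(3, 8, 3, h=16, w=16)
opt = torch.optim.Adam(layer.parameters(), lr=1e-2)
x, y = torch.randn(4, 3, 16, 16), torch.randn(4, 8, 16, 16)
nll = F.mse_loss(layer(x), y)  # batch MSE as a stand-in for the NLL
neg_log_marginal_likelihood(nll, layer).backward()
opt.step()
```

The paper applies this principle layer-wise in deep networks; the sketch shows only a single layer with one scalar precision.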
Related papers
- Approximate Equivariance in Reinforcement Learning
Equivariant neural networks have shown great success in reinforcement learning.
In many problems, only approximate symmetry is present, which makes imposing exact symmetry inappropriate.
We develop approximately equivariant algorithms in reinforcement learning.
arXiv Detail & Related papers (2024-11-06T19:44:46Z)
- Symmetry Discovery for Different Data Types
Equivariant neural networks incorporate symmetries into their architecture, achieving higher generalization performance.
We propose LieSD, a method for discovering symmetries via trained neural networks that approximate the input-output mappings of the tasks.
We validate the performance of LieSD on tasks with symmetries such as the two-body problem, the moment of inertia matrix prediction, and top quark tagging.
arXiv Detail & Related papers (2024-10-13T13:39:39Z)
- Enhancing lattice kinetic schemes for fluid dynamics with Lattice-Equivariant Neural Networks
We present a new class of equivariant neural networks, dubbed Lattice-Equivariant Neural Networks (LENNs).
Our approach develops within a recently introduced framework aimed at learning neural-network-based surrogate models of Lattice Boltzmann collision operators.
Our work opens the way towards practical use of machine-learning-augmented Lattice Boltzmann CFD in real-world simulations.
arXiv Detail & Related papers (2024-05-22T17:23:15Z)
- Symmetry-guided gradient descent for quantum neural networks
We formulate the symmetry constraints into a concise mathematical form.
We design two ways to adopt the constraints into the cost function.
We call the method symmetry-guided gradient descent (SGGD).
arXiv Detail & Related papers (2024-04-09T08:19:33Z)
- Model Merging by Uncertainty-Based Gradient Matching
We propose a new uncertainty-based scheme that improves performance by reducing the gradient mismatch.
Our new method gives consistent improvements for large language models and vision transformers.
arXiv Detail & Related papers (2023-10-19T15:02:45Z)
- Optimization Dynamics of Equivariant and Augmented Neural Networks
We investigate the optimization of neural networks on symmetric data.
We compare the strategy of constraining the architecture to be equivariant to that of using data augmentation.
Our analysis reveals that even in the latter situation, stationary points may be unstable for augmented training although they are stable for the manifestly equivariant models.
arXiv Detail & Related papers (2023-03-23T17:26:12Z)
- The Surprising Effectiveness of Equivariant Models in Domains with Latent Symmetry
We show that imposing symmetry constraints that do not exactly match the domain symmetry is very helpful in learning the true symmetry in the environment.
We demonstrate that an equivariant model can significantly outperform non-equivariant methods on domains with latent symmetries both in supervised learning and in reinforcement learning for robotic manipulation and control problems.
arXiv Detail & Related papers (2022-11-16T21:51:55Z)
- Symmetry-driven graph neural networks
We introduce two graph network architectures that are equivariant to several types of transformations affecting the node coordinates.
We demonstrate these capabilities on a synthetic dataset composed of $n$-dimensional geometric objects.
arXiv Detail & Related papers (2021-05-28T18:54:12Z)
- Learning Invariances in Neural Networks
We show how to parameterize a distribution over augmentations and optimize the training loss simultaneously with respect to the network parameters and augmentation parameters.
We can recover the correct set and extent of invariances on image classification, regression, segmentation, and molecular property prediction from a large space of augmentations (see the sketch after this list).
arXiv Detail & Related papers (2020-10-22T17:18:48Z)
- Meta-Learning Symmetries by Reparameterization
We present a method for learning and encoding equivariances into networks by learning corresponding parameter sharing patterns from data.
Our experiments suggest that it can automatically learn to encode equivariances to common transformations used in image processing tasks.
arXiv Detail & Related papers (2020-07-06T17:59:54Z)
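The "Learning Invariances in Neural Networks" entry above learns a distribution over augmentations jointly with the network weights. Below is a minimal PyTorch sketch of that idea for rotations; the class name, the single learned `width` parameter, and the small term rewarding wider ranges are illustrative assumptions rather than the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LearnedRotationInvariance(nn.Module):
    """Averages a base network over rotations sampled from a learned range."""
    def __init__(self, net, n_samples=4):
        super().__init__()
        self.net = net
        self.n_samples = n_samples
        self.width = nn.Parameter(torch.tensor(0.1))  # max |angle|, radians

    def rotate(self, x, angle):
        # Differentiable rotation via an affine sampling grid, so gradients
        # flow from the loss back into the sampled angle and hence `width`.
        cos, sin, zero = torch.cos(angle), torch.sin(angle), torch.zeros(())
        theta = torch.stack([torch.stack([cos, -sin, zero]),
                             torch.stack([sin, cos, zero])])
        theta = theta.unsqueeze(0).repeat(x.size(0), 1, 1)
        grid = F.affine_grid(theta, x.shape, align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)

    def forward(self, x):
        outs = []
        for _ in range(self.n_samples):
            # Reparameterised sample: uniform on [-width, width].
            angle = (2 * torch.rand(()) - 1) * self.width
            outs.append(self.net(self.rotate(x, angle)))
        return torch.stack(outs).mean(0)

# Usage: the extra term rewards wider invariance; the data fit shrinks
# `width` whenever the task is not actually rotation invariant.
net = nn.Sequential(nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10))
model = LearnedRotationInvariance(net)
x, y = torch.randn(2, 1, 28, 28), torch.tensor([3, 7])
loss = F.cross_entropy(model(x), y) - 0.01 * model.width
loss.backward()
```

The learned `width` then reads out how much rotation invariance the model has discovered from the data.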