Relaxed Equivariance via Multitask Learning
- URL: http://arxiv.org/abs/2410.17878v2
- Date: Fri, 24 Jan 2025 16:19:45 GMT
- Title: Relaxed Equivariance via Multitask Learning
- Authors: Ahmed A. Elhag, T. Konstantin Rusch, Francesco Di Giovanni, Michael Bronstein
- Abstract summary: We introduce REMUL, a training procedure for approximating equivariance with multitask learning.
We show that unconstrained models can learn approximate symmetries by minimizing an additional simple equivariance loss.
Our method achieves competitive performance compared to equivariant baselines while being $10 \times$ faster at inference and $2.5 \times$ faster at training.
- Score: 7.905957228045955
- Abstract: Incorporating equivariance as an inductive bias into deep learning architectures to take advantage of the data symmetry has been successful in multiple applications, such as chemistry and dynamical systems. In particular, roto-translations are crucial for effectively modeling geometric graphs and molecules, where understanding the 3D structures enhances generalization. However, equivariant models often pose challenges due to their high computational complexity. In this paper, we introduce REMUL, a training procedure for approximating equivariance with multitask learning. We show that unconstrained models (which do not build equivariance into the architecture) can learn approximate symmetries by minimizing an additional simple equivariance loss. By formulating equivariance as a new learning objective, we can control the level of approximate equivariance in the model. Our method achieves competitive performance compared to equivariant baselines while being $10 \times$ faster at inference and $2.5 \times$ at training.
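The multitask idea described above can be illustrated with a short training-loop sketch. The following is a minimal PyTorch sketch under stated assumptions, not the authors' implementation: an unconstrained model is trained on its task loss plus a weighted penalty on the discrepancy between $f(g \cdot x)$ and $g \cdot f(x)$ for sampled rotations $g$, with the weight controlling the level of approximate equivariance. The toy model, the rotation sampler, and names such as `equivariance_penalty` and `lambda_equiv` are illustrative assumptions.

```python
# Minimal sketch (assumed PyTorch): multitask training with an auxiliary
# equivariance penalty, loosely following the idea described in the abstract.
import torch
import torch.nn as nn


def random_rotation(batch_size: int) -> torch.Tensor:
    """Sample random 3x3 rotation matrices via QR decomposition (illustrative)."""
    q, r = torch.linalg.qr(torch.randn(batch_size, 3, 3))
    d = torch.sign(torch.diagonal(r, dim1=-2, dim2=-1))   # (B, 3)
    q = q * d.unsqueeze(-2)                                # fix column signs
    q = q * torch.det(q).sign().view(-1, 1, 1)             # force det = +1
    return q


def equivariance_penalty(model: nn.Module, x: torch.Tensor) -> torch.Tensor:
    """Penalty || f(g x) - g f(x) ||^2, with one sampled rotation per example."""
    g = random_rotation(x.shape[0])                        # (B, 3, 3)
    f_x = model(x)                                         # (B, N, 3), outputs treated as vectors
    f_gx = model(torch.einsum('bij,bnj->bni', g, x))       # f applied to rotated inputs
    g_fx = torch.einsum('bij,bnj->bni', g, f_x)            # rotation applied to outputs
    return ((f_gx - g_fx) ** 2).mean()


# Unconstrained toy model: a plain MLP over flattened point coordinates (assumption).
n_points = 8
model = nn.Sequential(nn.Flatten(), nn.Linear(n_points * 3, 64), nn.ReLU(),
                      nn.Linear(64, n_points * 3), nn.Unflatten(1, (n_points, 3)))

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
lambda_equiv = 1.0  # controls the degree of approximate equivariance

for _ in range(10):  # toy loop on random data
    x = torch.randn(32, n_points, 3)
    target = torch.randn(32, n_points, 3)
    task_loss = ((model(x) - target) ** 2).mean()
    loss = task_loss + lambda_equiv * equivariance_penalty(model, x)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Setting `lambda_equiv` to zero recovers ordinary unconstrained training, while larger values push the model toward stricter (approximate) equivariance.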
Related papers
- Large Language-Geometry Model: When LLM meets Equivariance [53.8505081745406]
We propose EquiLLM, a novel framework for representing 3D physical systems.
We show that EquiLLM delivers significant improvements over previous methods across molecular dynamics simulation, human motion simulation, and antibody design.
arXiv Detail & Related papers (2025-02-16T14:50:49Z)
- On the Utility of Equivariance and Symmetry Breaking in Deep Learning Architectures on Point Clouds [1.4079337353605066]
This paper explores the key factors that influence the performance of models working with point clouds.
We identify the key aspects of equivariant and non-equivariant architectures that drive success in different tasks.
arXiv Detail & Related papers (2025-01-01T07:00:41Z)
- Does equivariance matter at scale? [15.247352029530523]
We study how equivariant and non-equivariant networks scale with compute and training samples.
First, equivariance improves data efficiency, but training non-equivariant models with data augmentation can close this gap given sufficient epochs.
Second, scaling with compute follows a power law, with equivariant models outperforming non-equivariant ones at each tested compute budget.
arXiv Detail & Related papers (2024-10-30T16:36:59Z)
- Learning Layer-wise Equivariances Automatically using Gradients [66.81218780702125]
Convolutions encode equivariance symmetries into neural networks leading to better generalisation performance.
However, symmetries provide fixed hard constraints on the functions a network can represent; they need to be specified in advance and cannot be adapted.
Our goal is to allow flexible symmetry constraints that can automatically be learned from data using gradients.
arXiv Detail & Related papers (2023-10-09T20:22:43Z)
- The Lie Derivative for Measuring Learned Equivariance [84.29366874540217]
We study the equivariance properties of hundreds of pretrained models, spanning CNNs, transformers, and Mixer architectures.
We find that many violations of equivariance can be linked to spatial aliasing in ubiquitous network layers, such as pointwise non-linearities.
For example, transformers can be more equivariant than convolutional neural networks after training (a minimal sketch of such an equivariance check appears after this list).
arXiv Detail & Related papers (2022-10-06T15:20:55Z)
- HyperInvariances: Amortizing Invariance Learning [10.189246340672245]
Invariance learning is expensive and data intensive for popular neural architectures.
We introduce the notion of amortizing invariance learning.
This framework can identify appropriate invariances in different downstream tasks and lead to comparable or better test performance.
arXiv Detail & Related papers (2022-07-17T21:40:37Z)
- Equivariance Discovery by Learned Parameter-Sharing [153.41877129746223]
We study how to discover interpretable equivariances from data.
Specifically, we formulate this discovery process as an optimization problem over a model's parameter-sharing schemes.
Also, we theoretically analyze the method for Gaussian data and provide a bound on the mean squared gap between the studied discovery scheme and the oracle scheme.
arXiv Detail & Related papers (2022-04-07T17:59:19Z)
- Equivariant vector field network for many-body system modeling [65.22203086172019]
The Equivariant Vector Field Network (EVFN) is built on a novel equivariant basis and the associated scalarization and vectorization layers.
We evaluate our method on predicting trajectories of simulated Newton mechanics systems with both full and partially observed data.
arXiv Detail & Related papers (2021-10-26T14:26:25Z)
- Meta-Learning Symmetries by Reparameterization [63.85144439337671]
We present a method for learning and encoding equivariances into networks by learning corresponding parameter sharing patterns from data.
Our experiments suggest that it can automatically learn to encode equivariances to common transformations used in image processing tasks.
arXiv Detail & Related papers (2020-07-06T17:59:54Z)
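Related to the Lie Derivative paper listed above (and referenced in that entry), the following is a minimal sketch of how one might empirically check the learned equivariance of a trained model by comparing $f(g \cdot x)$ with $g \cdot f(x)$ over sampled rotations. It is a finite-transformation proxy, not the Lie-derivative estimator from that paper, and it assumes the definitions from the earlier sketch (`random_rotation`, `model`, `n_points`) are in scope.

```python
# Minimal sketch (assumed PyTorch): empirically measuring how equivariant a
# trained model is by comparing f(g x) and g f(x) over sampled rotations.
import torch


@torch.no_grad()
def equivariance_error(model, x: torch.Tensor, n_samples: int = 16) -> float:
    """Mean relative discrepancy ||f(g x) - g f(x)|| / ||g f(x)|| over sampled rotations."""
    errors = []
    for _ in range(n_samples):
        g = random_rotation(x.shape[0])   # hypothetical sampler from the sketch above
        f_gx = model(torch.einsum('bij,bnj->bni', g, x))
        g_fx = torch.einsum('bij,bnj->bni', g, model(x))
        errors.append((f_gx - g_fx).norm() / g_fx.norm().clamp_min(1e-8))
    return torch.stack(errors).mean().item()


# Example usage with the toy model from the earlier sketch:
# print(equivariance_error(model, torch.randn(32, n_points, 3)))
```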
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.