Learning (Approximately) Equivariant Networks via Constrained Optimization
- URL: http://arxiv.org/abs/2505.13631v1
- Date: Mon, 19 May 2025 18:08:09 GMT
- Title: Learning (Approximately) Equivariant Networks via Constrained Optimization
- Authors: Andrei Manolache, Luiz F. O. Chamon, Mathias Niepert
- Abstract summary: Equivariant neural networks are designed to respect symmetries through their architecture. Real-world data often departs from perfect symmetry because of noise, structural variation, measurement bias, or other symmetry-breaking effects. We introduce Adaptive Constrained Equivariance (ACE), a constrained optimization approach that starts with a flexible, non-equivariant model and gradually reduces its deviation from equivariance.
- Score: 25.51476313302483
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Equivariant neural networks are designed to respect symmetries through their architecture, boosting generalization and sample efficiency when those symmetries are present in the data distribution. Real-world data, however, often departs from perfect symmetry because of noise, structural variation, measurement bias, or other symmetry-breaking effects. Strictly equivariant models may struggle to fit the data, while unconstrained models lack a principled way to leverage partial symmetries. Even when the data is fully symmetric, enforcing equivariance can hurt training by limiting the model to a restricted region of the parameter space. Guided by homotopy principles, where an optimization problem is solved by gradually transforming a simpler problem into a complex one, we introduce Adaptive Constrained Equivariance (ACE), a constrained optimization approach that starts with a flexible, non-equivariant model and gradually reduces its deviation from equivariance. This gradual tightening smooths training early on and settles the model at a data-driven equilibrium, balancing between equivariance and non-equivariance. Across multiple architectures and tasks, our method consistently improves performance metrics, sample efficiency, and robustness to input perturbations compared with strictly equivariant models and heuristic equivariance relaxations.
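To make the constrained-optimization idea concrete, here is a minimal sketch of one way such a scheme could look as a primal-dual training step: the task loss is minimized subject to a bound on the model's deviation from equivariance, and a Lagrange multiplier grows while that bound is violated, gradually tightening equivariance over training. All names (`equivariance_gap`, `g_in`, `g_out`, the tolerance `eps`) are illustrative assumptions, not the authors' implementation.

```python
import torch

def equivariance_gap(model, x, g_in, g_out):
    """Squared deviation from equivariance, ||f(g.x) - g.f(x)||^2,
    for one sampled group element acting as g_in on inputs and g_out on outputs."""
    return (model(g_in(x)) - g_out(model(x))).pow(2).mean()

def ace_step(model, loss_fn, x, y, g_in, g_out, opt, lam, eps, dual_lr=0.01):
    """One primal-dual step: minimize the task loss subject to equivariance_gap <= eps.

    `lam` is a Lagrange multiplier: dual ascent raises it while the constraint
    is violated, which gradually tightens equivariance as training proceeds.
    """
    opt.zero_grad()
    gap = equivariance_gap(model, x, g_in, g_out)
    loss = loss_fn(model(x), y) + lam * (gap - eps)
    loss.backward()
    opt.step()
    # Dual ascent on the multiplier (projected to stay non-negative).
    lam = max(0.0, lam + dual_lr * (gap.item() - eps))
    return lam
```

With `lam` initialized near zero, early training behaves like the unconstrained model; the multiplier then settles at a data-driven balance between fitting the data and respecting the symmetry, matching the homotopy intuition in the abstract.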
Related papers
- Improving Equivariant Networks with Probabilistic Symmetry Breaking [9.164167226137664]
Equivariant networks encode known symmetries into neural networks, often enhancing generalization. Strict equivariance, however, prevents outputs from breaking the self-symmetries of their inputs. This poses an important problem, both (1) for prediction tasks on domains where self-symmetries are common, and (2) for generative models, which must break symmetries in order to reconstruct from highly symmetric latent spaces. We present novel theoretical results that establish sufficient conditions for representing such symmetry-breaking distributions.
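The obstruction alluded to here follows from a one-line argument (standard background, not a result of this paper): a strictly equivariant map must preserve every self-symmetry of its input.

```latex
g \cdot x = x \;\Longrightarrow\; f(x) = f(g \cdot x) = g \cdot f(x),
```

so the output $f(x)$ is itself $g$-symmetric; producing an asymmetric output from a symmetric input requires relaxing equivariance or injecting randomness.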
arXiv Detail & Related papers (2025-03-27T21:04:49Z) - Rao-Blackwell Gradient Estimators for Equivariant Denoising Diffusion [41.50816120270017]
In domains such as molecular and protein generation, physical systems exhibit inherent symmetries that are critical to model. We present a framework that reduces training variance and provides a provably lower-variance gradient estimator. We also present a practical implementation of this estimator, which incorporates the loss and sampling procedure through a method we call Orbit Diffusion.
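As a rough illustration of the orbit-averaging idea behind such variance-reduced estimators (a generic sketch under the assumption of a group-invariant loss, not the paper's Orbit Diffusion estimator; `sample_group_action` is an assumed helper):

```python
import torch

def orbit_averaged_loss(model, loss_fn, x, target, sample_group_action, n=8):
    """Monte Carlo average of the loss over random group transformations.

    For a loss that is invariant under the group action, averaging over the
    orbit is a Rao-Blackwellization of the single-sample estimator, so the
    resulting gradient estimate cannot have higher variance.
    sample_group_action() returns a callable applying one random group element.
    """
    losses = []
    for _ in range(n):
        g = sample_group_action()
        losses.append(loss_fn(model(g(x)), g(target)))
    return torch.stack(losses).mean()
```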
arXiv Detail & Related papers (2025-02-14T03:26:57Z) - Approximate Equivariance in Reinforcement Learning [35.04248486334824]
We develop approximately equivariant algorithms in reinforcement learning. Results show that the approximately equivariant network performs on par with exactly equivariant networks when exact symmetries are present.
arXiv Detail & Related papers (2024-11-06T19:44:46Z) - Approximately Equivariant Neural Processes [47.14384085714576]
When modelling real-world data, learning problems are often not exactly equivariant, but only approximately.
Current approaches to building in approximate equivariance cannot usually be applied out-of-the-box to any architecture and symmetry group.
We develop a general approach to achieving approximate equivariance using existing equivariant architectures.
arXiv Detail & Related papers (2024-06-19T12:17:14Z) - Scaling and renormalization in high-dimensional regression [72.59731158970894]
We present a unifying perspective on recent results on ridge regression, using basic tools of random matrix theory and free probability, aimed at readers with backgrounds in physics and deep learning. Our results extend and unify earlier models of scaling laws.
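For orientation, the central object here is ordinary ridge regression (a textbook formula, not a result of the paper):

```latex
\hat{\beta}_\lambda
  = \arg\min_{\beta}\, \|y - X\beta\|_2^2 + \lambda \|\beta\|_2^2
  = \left(X^\top X + \lambda I\right)^{-1} X^\top y,
```

whose high-dimensional test risk is what the random-matrix and free-probability tools characterize.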
arXiv Detail & Related papers (2024-05-01T15:59:00Z) - Symmetry Breaking and Equivariant Neural Networks [17.740760773905986]
We introduce a novel notion of 'relaxed equivariance'.
We show how to incorporate this relaxation into equivariant multilayer perceptrons (E-MLPs).
The relevance of symmetry breaking is then discussed in various application domains.
arXiv Detail & Related papers (2023-12-14T15:06:48Z) - Learning Layer-wise Equivariances Automatically using Gradients [66.81218780702125]
Convolutions encode equivariance symmetries into neural networks, leading to better generalisation performance.
However, such symmetries provide fixed hard constraints on the functions a network can represent: they need to be specified in advance and cannot be adapted.
Our goal is to allow flexible symmetry constraints that can automatically be learned from data using gradients.
arXiv Detail & Related papers (2023-10-09T20:22:43Z) - The Surprising Effectiveness of Equivariant Models in Domains with Latent Symmetry [6.716931832076628]
We show that imposing symmetry constraints that do not exactly match the domain symmetry is very helpful in learning the true symmetry in the environment.
We demonstrate that an equivariant model can significantly outperform non-equivariant methods on domains with latent symmetries both in supervised learning and in reinforcement learning for robotic manipulation and control problems.
arXiv Detail & Related papers (2022-11-16T21:51:55Z) - Relaxing Equivariance Constraints with Non-stationary Continuous Filters [20.74154804898478]
The proposed parameterization can be thought of as a building block to allow adjustable symmetry structure in neural networks.
Compared to non-equivariant or strict-equivariant baselines, we experimentally verify that soft equivariance leads to improved performance in terms of test accuracy on CIFAR-10 and CIFAR-100 image classification tasks.
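One way to picture such an adjustable building block (a generic sketch of soft translation equivariance, not the paper's non-stationary filter parameterization; `SoftEquivariantBlock` and its fields are assumptions):

```python
import torch
import torch.nn as nn

class SoftEquivariantBlock(nn.Module):
    """Blend an equivariant branch with an unconstrained one.

    alpha = 0 recovers a strictly translation-equivariant layer (the shared
    convolution kernel); alpha != 0 adds a position-dependent term that
    breaks the symmetry by a controlled amount.
    """
    def __init__(self, channels, size):
        super().__init__()
        self.equiv = nn.Conv2d(channels, channels, 3, padding=1)  # weight sharing => equivariance
        self.bias_map = nn.Parameter(torch.zeros(channels, size, size))  # position-dependent
        self.alpha = nn.Parameter(torch.tensor(0.0))

    def forward(self, x):
        return self.equiv(x) + self.alpha * self.bias_map
```

With `alpha` learned and regularized toward zero, such a layer can settle anywhere between strict translation equivariance and a fully position-dependent response.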
arXiv Detail & Related papers (2022-04-14T18:08:36Z) - Approximately Equivariant Networks for Imperfectly Symmetric Dynamics [24.363954435050264]
We find that our models can outperform both baselines with no symmetry bias and baselines with overly strict symmetry in both simulated turbulence domains and real-world multi-stream jet flow.
arXiv Detail & Related papers (2022-01-28T07:31:28Z) - Equivariant vector field network for many-body system modeling [65.22203086172019]
The Equivariant Vector Field Network (EVFN) is built on a novel equivariant basis and the associated scalarization and vectorization layers.
We evaluate our method on predicting trajectories of simulated Newtonian mechanics systems with both fully and partially observed data.
arXiv Detail & Related papers (2021-10-26T14:26:25Z) - Learning Invariances in Neural Networks [51.20867785006147]
We show how to parameterize a distribution over augmentations and optimize the training loss simultaneously with respect to the network parameters and augmentation parameters.
We can recover the correct set and extent of invariances on image classification, regression, segmentation, and molecular property prediction from a large space of augmentations.
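A minimal sketch of that joint optimization for a single learned invariance (image rotations), using a differentiable warp so gradients reach the augmentation parameter; the names and the uniform-angle parameterization are assumptions in the spirit of the paper, not its code:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def rotate(x, angle):
    """Differentiably rotate a batch of images (N, C, H, W) by `angle` radians."""
    c, s = torch.cos(angle), torch.sin(angle)
    zero = torch.zeros_like(angle)
    theta = torch.stack([torch.stack([c, -s, zero]),
                         torch.stack([s,  c, zero])]).unsqueeze(0)
    grid = F.affine_grid(theta.expand(x.size(0), -1, -1), x.shape, align_corners=False)
    return F.grid_sample(x, grid, align_corners=False)

class LearnedRotationInvariance(nn.Module):
    """Average predictions over rotations drawn from Uniform(-width, width).

    `width` is a learnable augmentation parameter trained jointly with `net`;
    because rotate() is differentiable, gradients flow back into `width`.
    """
    def __init__(self, net, n_samples=4):
        super().__init__()
        self.net = net
        self.n_samples = n_samples
        self.width = nn.Parameter(torch.tensor(0.1))  # radians

    def forward(self, x):
        outs = [self.net(rotate(x, (2 * torch.rand(()) - 1) * self.width))
                for _ in range(self.n_samples)]
        return torch.stack(outs).mean(0)
```

Adding a penalty such as `-width` to the training loss favors the widest invariance consistent with the data, which is how the correct extent of an invariance can be recovered rather than fixed in advance.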
arXiv Detail & Related papers (2020-10-22T17:18:48Z) - Accounting for Unobserved Confounding in Domain Generalization [107.0464488046289]
This paper investigates the problem of learning robust, generalizable prediction models from a combination of datasets.
Part of the challenge of learning robust models lies in the influence of unobserved confounders.
We demonstrate the empirical performance of our approach on healthcare data from different modalities.
arXiv Detail & Related papers (2020-07-21T08:18:06Z)