Improving Equivariant Model Training via Constraint Relaxation
- URL: http://arxiv.org/abs/2408.13242v1
- Date: Fri, 23 Aug 2024 17:35:08 GMT
- Title: Improving Equivariant Model Training via Constraint Relaxation
- Authors: Stefanos Pertigkiozoglou, Evangelos Chatzipantazis, Shubhendu Trivedi, Kostas Daniilidis,
- Abstract summary: Equivariant neural networks have been widely used in a variety of applications due to their ability to generalize well in tasks where the underlying data symmetries are known.
We propose a novel framework for improving the optimization of such models by relaxing the hard equivariance constraint during training.
We provide experimental results on different state-of-the-art network architectures, demonstrating how this training framework can result in equivariant models with improved generalization performance.
- Score: 31.507956579770088
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Equivariant neural networks have been widely used in a variety of applications due to their ability to generalize well in tasks where the underlying data symmetries are known. Despite their successes, such networks can be difficult to optimize and require careful hyperparameter tuning to train successfully. In this work, we propose a novel framework for improving the optimization of such models by relaxing the hard equivariance constraint during training: We relax the equivariance constraint of the network's intermediate layers by introducing an additional non-equivariance term that we progressively constrain until we arrive at an equivariant solution. By controlling the magnitude of the activation of the additional relaxation term, we allow the model to optimize over a larger hypothesis space containing approximate equivariant networks and converge back to an equivariant solution at the end of training. We provide experimental results on different state-of-the-art network architectures, demonstrating how this training framework can result in equivariant models with improved generalization performance.
Related papers
- Approximate Equivariance in Reinforcement Learning [35.04248486334824]
We develop approximately equivariant algorithms in reinforcement learning.
Results show that the approximately equivariant network performs on par with exactly equivariant networks when exact symmetries are present.
arXiv Detail & Related papers (2024-11-06T19:44:46Z) - Equivariant Ensembles and Regularization for Reinforcement Learning in Map-based Path Planning [5.69473229553916]
This paper proposes a method to construct equivariant policies and invariant value functions without specialized neural network components.
We show how equivariant ensembles and regularization benefit sample efficiency and performance.
arXiv Detail & Related papers (2024-03-19T16:01:25Z) - Universal Neural Functionals [67.80283995795985]
A challenging problem in many modern machine learning tasks is to process weight-space features.
Recent works have developed promising weight-space models that are equivariant to the permutation symmetries of simple feedforward networks.
This work proposes an algorithm that automatically constructs permutation equivariant models for any weight space.
arXiv Detail & Related papers (2024-02-07T20:12:27Z) - Equivariant Adaptation of Large Pretrained Models [20.687626756753563]
We show that a canonicalization network can effectively be used to make a large pretrained network equivariant.
Using dataset-dependent priors to inform the canonicalization function, we are able to make large pretrained models equivariant while maintaining their performance.
arXiv Detail & Related papers (2023-10-02T21:21:28Z) - Optimization Dynamics of Equivariant and Augmented Neural Networks [2.7918308693131135]
We investigate the optimization of neural networks on symmetric data.
We compare the strategy of constraining the architecture to be equivariant to that of using data augmentation.
Our analysis reveals that even in the latter situation, stationary points may be unstable for augmented training although they are stable for the manifestly equivariant models.
arXiv Detail & Related papers (2023-03-23T17:26:12Z) - Deep Neural Networks with Efficient Guaranteed Invariances [77.99182201815763]
We address the problem of improving the performance and in particular the sample complexity of deep neural networks.
Group-equivariant convolutions are a popular approach to obtain equivariant representations.
We propose a multi-stream architecture, where each stream is invariant to a different transformation.
arXiv Detail & Related papers (2023-03-02T20:44:45Z) - Backpropagation of Unrolled Solvers with Folded Optimization [55.04219793298687]
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks.
One typical strategy is algorithm unrolling, which relies on automatic differentiation through the operations of an iterative solver.
This paper provides theoretical insights into the backward pass of unrolled optimization, leading to a system for generating efficiently solvable analytical models of backpropagation.
arXiv Detail & Related papers (2023-01-28T01:50:42Z) - Architectural Optimization over Subgroups for Equivariant Neural
Networks [0.0]
We propose equivariance relaxation morphism and $[G]$-mixed equivariant layer to operate with equivariance constraints on a subgroup.
We present evolutionary and differentiable neural architecture search (NAS) algorithms that utilize these mechanisms respectively for equivariance-aware architectural optimization.
arXiv Detail & Related papers (2022-10-11T14:37:29Z) - Relaxing Equivariance Constraints with Non-stationary Continuous Filters [20.74154804898478]
The proposed parameterization can be thought of as a building block to allow adjustable symmetry structure in neural networks.
Compared to non-equivariant or strict-equivariant baselines, we experimentally verify that soft equivariance leads to improved performance in terms of test accuracy on CIFAR-10 and CIFAR-100 image classification tasks.
arXiv Detail & Related papers (2022-04-14T18:08:36Z) - Improving the Sample-Complexity of Deep Classification Networks with
Invariant Integration [77.99182201815763]
Leveraging prior knowledge on intraclass variance due to transformations is a powerful method to improve the sample complexity of deep neural networks.
We propose a novel monomial selection algorithm based on pruning methods to allow an application to more complex problems.
We demonstrate the improved sample complexity on the Rotated-MNIST, SVHN and CIFAR-10 datasets.
arXiv Detail & Related papers (2022-02-08T16:16:11Z) - Equivariant vector field network for many-body system modeling [65.22203086172019]
Equivariant Vector Field Network (EVFN) is built on a novel equivariant basis and the associated scalarization and vectorization layers.
We evaluate our method on predicting trajectories of simulated Newton mechanics systems with both full and partially observed data.
arXiv Detail & Related papers (2021-10-26T14:26:25Z) - A Flexible Framework for Designing Trainable Priors with Adaptive
Smoothing and Game Encoding [57.1077544780653]
We introduce a general framework for designing and training neural network layers whose forward passes can be interpreted as solving non-smooth convex optimization problems.
We focus on convex games, solved by local agents represented by the nodes of a graph and interacting through regularization functions.
This approach is appealing for solving imaging problems, as it allows the use of classical image priors within deep models that are trainable end to end.
arXiv Detail & Related papers (2020-06-26T08:34:54Z) - Automatically Learning Compact Quality-aware Surrogates for Optimization
Problems [55.94450542785096]
Solving optimization problems with unknown parameters requires learning a predictive model to predict the values of the unknown parameters and then solving the problem using these values.
Recent work has shown that including the optimization problem as a layer in a complex training model pipeline results in predictions of iteration of unobserved decision making.
We show that we can improve solution quality by learning a low-dimensional surrogate model of a large optimization problem.
arXiv Detail & Related papers (2020-06-18T19:11:54Z) - Neural Control Variates [71.42768823631918]
We show that a set of neural networks can face the challenge of finding a good approximation of the integrand.
We derive a theoretically optimal, variance-minimizing loss function, and propose an alternative, composite loss for stable online training in practice.
Specifically, we show that the learned light-field approximation is of sufficient quality for high-order bounces, allowing us to omit the error correction and thereby dramatically reduce the noise at the cost of negligible visible bias.
arXiv Detail & Related papers (2020-06-02T11:17:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.