Improving Equivariant Model Training via Constraint Relaxation
- URL: http://arxiv.org/abs/2408.13242v1
- Date: Fri, 23 Aug 2024 17:35:08 GMT
- Title: Improving Equivariant Model Training via Constraint Relaxation
- Authors: Stefanos Pertigkiozoglou, Evangelos Chatzipantazis, Shubhendu Trivedi, Kostas Daniilidis
- Abstract summary: Equivariant neural networks have been widely used in a variety of applications due to their ability to generalize well in tasks where the underlying data symmetries are known.
We propose a novel framework for improving the optimization of such models by relaxing the hard equivariance constraint during training.
We provide experimental results on different state-of-the-art network architectures, demonstrating how this training framework can result in equivariant models with improved generalization performance.
- Score: 31.507956579770088
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Equivariant neural networks have been widely used in a variety of applications due to their ability to generalize well in tasks where the underlying data symmetries are known. Despite their successes, such networks can be difficult to optimize and require careful hyperparameter tuning to train successfully. In this work, we propose a novel framework for improving the optimization of such models by relaxing the hard equivariance constraint during training: We relax the equivariance constraint of the network's intermediate layers by introducing an additional non-equivariance term that we progressively constrain until we arrive at an equivariant solution. By controlling the magnitude of the activation of the additional relaxation term, we allow the model to optimize over a larger hypothesis space containing approximate equivariant networks and converge back to an equivariant solution at the end of training. We provide experimental results on different state-of-the-art network architectures, demonstrating how this training framework can result in equivariant models with improved generalization performance.
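To make the idea concrete, the following is a minimal PyTorch sketch of the training scheme described in the abstract. It is an illustration under assumptions, not the authors' implementation: the names (`RelaxedEquivariantLayer`, `relaxation_penalty`), the linear residual used as the non-equivariance term, and the linear penalty schedule are all hypothetical choices. Each layer adds a non-equivariant residual scaled by a learnable magnitude, and a penalty whose weight grows during training drives those magnitudes toward zero so the final network is equivariant.

```python
# Hypothetical sketch (not the authors' code): relaxed equivariance via an
# additional non-equivariant term that is progressively constrained to zero.
import torch
import torch.nn as nn

class RelaxedEquivariantLayer(nn.Module):
    """Equivariant map plus a scaled, unconstrained (non-equivariant) residual."""
    def __init__(self, equivariant_layer: nn.Module, dim: int):
        super().__init__()
        self.equivariant = equivariant_layer          # any equivariant layer
        self.residual = nn.Linear(dim, dim)           # illustrative non-equivariant term
        self.alpha = nn.Parameter(torch.tensor(1.0))  # learnable relaxation magnitude

    def forward(self, x):
        return self.equivariant(x) + self.alpha * self.residual(x)

def relaxation_penalty(model: nn.Module, weight: float) -> torch.Tensor:
    """Penalize the activation magnitude of all relaxation terms in the model."""
    return weight * sum(
        m.alpha.abs() for m in model.modules()
        if isinstance(m, RelaxedEquivariantLayer)
    )

def train(model, loader, epochs: int, lr: float = 1e-3, max_penalty: float = 10.0):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for epoch in range(epochs):
        # Simple linear schedule: the penalty weight grows over training,
        # progressively constraining the non-equivariant terms.
        weight = max_penalty * epoch / max(epochs - 1, 1)
        for x, y in loader:
            loss = nn.functional.mse_loss(model(x), y) + relaxation_penalty(model, weight)
            opt.zero_grad()
            loss.backward()
            opt.step()
```

Any schedule that eventually forces the relaxation magnitudes to (near) zero lets the model search a larger hypothesis space of approximately equivariant networks early on while still returning an equivariant solution at the end of training.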
Related papers
- Equivariant Ensembles and Regularization for Reinforcement Learning in Map-based Path Planning [5.69473229553916]
This paper proposes a method to construct equivariant policies and invariant value functions without specialized neural network components.
We show how equivariant ensembles and regularization benefit sample efficiency and performance.
arXiv Detail & Related papers (2024-03-19T16:01:25Z)
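One plausible reading of the equivariant-ensemble entry above is a group-averaging (symmetrization) construction built from ordinary network components. The sketch below is a hypothetical illustration under assumptions (a C4 rotation symmetry acting on map observations and 2D velocity actions; `GroupAveragedPolicy` and `rot2d` are invented names), not necessarily the paper's exact construction.

```python
# Hypothetical sketch: an equivariant policy obtained by averaging a plain
# policy network over the four planar rotations (C4).
import math
import torch
import torch.nn as nn

def rot2d(v: torch.Tensor, angle: float) -> torch.Tensor:
    """Rotate 2D action vectors v of shape (B, 2) by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    R = torch.tensor([[c, -s], [s, c]], dtype=v.dtype, device=v.device)
    return v @ R.T

class GroupAveragedPolicy(nn.Module):
    """Averages the base policy over C4 so the result is equivariant by construction."""
    def __init__(self, base_policy: nn.Module):
        super().__init__()
        self.base = base_policy  # any CNN mapping (B, C, H, W) -> (B, 2) actions

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        actions = []
        for k in range(4):                                  # rotations by k * 90 degrees
            rotated = torch.rot90(obs, k, dims=(-2, -1))    # g . obs
            a = self.base(rotated)                          # pi(g . obs)
            # Map the action back to the original frame; the exact sign convention
            # depends on how the map encodes orientation.
            actions.append(rot2d(a, -k * math.pi / 2))      # g^{-1} . pi(g . obs)
        return torch.stack(actions).mean(dim=0)             # average over the group
```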
- Equivariant Adaptation of Large Pretrained Models [20.687626756753563]
We show that a canonicalization network can effectively be used to make a large pretrained network equivariant.
Using dataset-dependent priors to inform the canonicalization function, we are able to make large pretrained models equivariant while maintaining their performance.
arXiv Detail & Related papers (2023-10-02T21:21:28Z)
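The canonicalization entry above can likewise be sketched: a small network predicts which of a discrete set of rotations an input is in, the input is rotated back to a canonical pose, and a frozen pretrained model is applied. This is an inference-time illustration under assumptions (C4 rotations; `CanonicalizedClassifier` and `canonicalizer` are invented names); the dataset-dependent priors mentioned above and the training of the canonicalizer are omitted.

```python
# Hypothetical sketch of canonicalization-based equivariant adaptation.
import torch
import torch.nn as nn

class CanonicalizedClassifier(nn.Module):
    """Maps each input to a canonical pose before applying a frozen pretrained model.
    For the overall map to be rotation invariant, the canonicalizer's scores should
    permute consistently under input rotation (i.e. it should itself be equivariant)."""
    def __init__(self, pretrained: nn.Module, canonicalizer: nn.Module):
        super().__init__()
        self.pretrained = pretrained        # large frozen model
        self.canonicalizer = canonicalizer  # small net: (B, C, H, W) -> (B, 4) pose scores

    @torch.no_grad()  # inference-time sketch; training needs a differentiable selection
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        scores = self.canonicalizer(x)                 # one score per candidate rotation
        k_star = scores.argmax(dim=1)                  # predicted pose index per sample
        canon = torch.stack([
            torch.rot90(img, -int(k), dims=(-2, -1))   # undo the predicted rotation
            for img, k in zip(x, k_star)
        ])
        return self.pretrained(canon)                  # prediction on the canonical pose
```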
- Optimization Dynamics of Equivariant and Augmented Neural Networks [2.7918308693131135]
We investigate the optimization of neural networks on symmetric data.
We compare the strategy of constraining the architecture to be equivariant to that of using data augmentation.
Our analysis reveals that even in the latter situation, stationary points may be unstable for augmented training although they are stable for the manifestly equivariant models.
arXiv Detail & Related papers (2023-03-23T17:26:12Z)
- Deep Neural Networks with Efficient Guaranteed Invariances [77.99182201815763]
We address the problem of improving the performance and in particular the sample complexity of deep neural networks.
Group-equivariant convolutions are a popular approach to obtain equivariant representations.
We propose a multi-stream architecture, where each stream is invariant to a different transformation.
arXiv Detail & Related papers (2023-03-02T20:44:45Z)
- Backpropagation of Unrolled Solvers with Folded Optimization [55.04219793298687]
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks.
One typical strategy is algorithm unrolling, which relies on automatic differentiation through the operations of an iterative solver.
This paper provides theoretical insights into the backward pass of unrolled optimization, leading to a system for generating efficiently solvable analytical models of backpropagation.
arXiv Detail & Related papers (2023-01-28T01:50:42Z)
- Architectural Optimization over Subgroups for Equivariant Neural Networks [0.0]
We propose an equivariance relaxation morphism and a $[G]$-mixed equivariant layer to operate with equivariance constraints on a subgroup.
We present evolutionary and differentiable neural architecture search (NAS) algorithms that utilize these mechanisms respectively for equivariance-aware architectural optimization.
arXiv Detail & Related papers (2022-10-11T14:37:29Z)
- Improving the Sample-Complexity of Deep Classification Networks with Invariant Integration [77.99182201815763]
Leveraging prior knowledge on intraclass variance due to transformations is a powerful method to improve the sample complexity of deep neural networks.
We propose a novel monomial selection algorithm based on pruning methods to allow an application to more complex problems.
We demonstrate the improved sample complexity on the Rotated-MNIST, SVHN and CIFAR-10 datasets.
arXiv Detail & Related papers (2022-02-08T16:16:11Z)
- Equivariant vector field network for many-body system modeling [65.22203086172019]
The Equivariant Vector Field Network (EVFN) is built on a novel equivariant basis and the associated scalarization and vectorization layers.
We evaluate our method on predicting trajectories of simulated Newton mechanics systems with both full and partially observed data.
arXiv Detail & Related papers (2021-10-26T14:26:25Z)
- Automatically Learning Compact Quality-aware Surrogates for Optimization Problems [55.94450542785096]
Solving optimization problems with unknown parameters requires learning a predictive model to predict the values of the unknown parameters and then solving the problem using these values.
Recent work has shown that including the optimization problem as a layer in the model training pipeline results in predictions of the unobserved parameters that lead to higher decision quality.
We show that we can improve solution quality by learning a low-dimensional surrogate model of a large optimization problem.
arXiv Detail & Related papers (2020-06-18T19:11:54Z)
- Neural Control Variates [71.42768823631918]
We show that a set of neural networks can face the challenge of finding a good approximation of the integrand.
We derive a theoretically optimal, variance-minimizing loss function, and propose an alternative, composite loss for stable online training in practice.
Specifically, we show that the learned light-field approximation is of sufficient quality for high-order bounces, allowing us to omit the error correction and thereby dramatically reduce the noise at the cost of negligible visible bias.
arXiv Detail & Related papers (2020-06-02T11:17:55Z)