Improving Equivariant Model Training via Constraint Relaxation
- URL: http://arxiv.org/abs/2408.13242v2
- Date: Thu, 02 Jan 2025 22:07:11 GMT
- Title: Improving Equivariant Model Training via Constraint Relaxation
- Authors: Stefanos Pertigkiozoglou, Evangelos Chatzipantazis, Shubhendu Trivedi, Kostas Daniilidis
- Abstract summary: We propose a novel framework for improving the optimization of equivariant models by relaxing the hard equivariance constraint during training.
We provide experimental results on different state-of-the-art network architectures, demonstrating how this training framework can result in equivariant models with improved generalization performance.
- Score: 31.507956579770088
- Abstract: Equivariant neural networks have been widely used in a variety of applications due to their ability to generalize well in tasks where the underlying data symmetries are known. Despite their successes, such networks can be difficult to optimize and require careful hyperparameter tuning to train successfully. In this work, we propose a novel framework for improving the optimization of such models by relaxing the hard equivariance constraint during training: We relax the equivariance constraint of the network's intermediate layers by introducing an additional non-equivariant term that we progressively constrain until we arrive at an equivariant solution. By controlling the magnitude of the activation of the additional relaxation term, we allow the model to optimize over a larger hypothesis space containing approximate equivariant networks and converge back to an equivariant solution at the end of training. We provide experimental results on different state-of-the-art network architectures, demonstrating how this training framework can result in equivariant models with improved generalization performance. Our code is available at https://github.com/StefanosPert/Equivariant_Optimization_CR
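As a rough sketch of the idea (not the authors' implementation; `equiv`, `free`, and the linear annealing schedule below are hypothetical placeholders), each intermediate layer can be written as an equivariant sub-layer plus a scaled unconstrained term whose magnitude is driven to zero over training:
```python
import torch
import torch.nn as nn

class RelaxedEquivariantLayer(nn.Module):
    """Equivariant sub-layer plus a scaled unconstrained (non-equivariant) term.

    Sketch only: `equiv` and `free` stand in for any constrained/unconstrained
    pair of sub-layers; the paper's exact parameterization may differ.
    """
    def __init__(self, equiv: nn.Module, free: nn.Module):
        super().__init__()
        self.equiv = equiv
        self.free = free
        # alpha scales the non-equivariant relaxation term
        self.register_buffer("alpha", torch.tensor(1.0))

    def forward(self, x):
        return self.equiv(x) + self.alpha * self.free(x)

def anneal(layer: RelaxedEquivariantLayer, step: int, total: int):
    # progressively constrain the relaxation so that training
    # converges back to a strictly equivariant model (alpha -> 0)
    layer.alpha.fill_(max(0.0, 1.0 - step / total))
```
Setting `alpha = 0` recovers a strictly equivariant network, so the final model keeps the exact symmetry guarantee while training explores the larger, approximately equivariant hypothesis space.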
Related papers
- Equivariant Ensembles and Regularization for Reinforcement Learning in Map-based Path Planning [5.69473229553916]
This paper proposes a method to construct equivariant policies and invariant value functions without specialized neural network components.
We show how equivariant ensembles and regularization benefit sample efficiency and performance.
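To make the construction concrete, here is a minimal, hypothetical sketch for 90-degree map rotations (the names `value_net` and `policy_net`, the action ordering, and the roll direction are all assumptions, not the paper's code):
```python
import torch

def invariant_value(value_net, obs):
    # average the value over all four planar rotations of the map
    # observation (B, C, H, W): the result is rotation-invariant
    vals = [value_net(torch.rot90(obs, k, dims=(-2, -1))) for k in range(4)]
    return torch.stack(vals).mean(0)

def equivariant_policy(policy_net, obs):
    # actions assumed ordered (up, right, down, left), so a 90-degree map
    # rotation corresponds to a cyclic shift of the four action logits;
    # the sign of the shift depends on the rotation convention
    outs = [torch.roll(policy_net(torch.rot90(obs, k, dims=(-2, -1))),
                       shifts=k, dims=-1) for k in range(4)]
    return torch.stack(outs).mean(0)
```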
arXiv Detail & Related papers (2024-03-19T16:01:25Z)
- Universal Neural Functionals [67.80283995795985]
A challenging problem in many modern machine learning tasks is processing weight-space features, i.e., features defined over the weights and gradients of other networks.
Recent works have developed promising weight-space models that are equivariant to the permutation symmetries of simple feedforward networks.
This work proposes an algorithm that automatically constructs permutation equivariant models for any weight space.
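The symmetry in question can be checked in a few lines: relabeling the hidden units of an MLP leaves its function unchanged, which is exactly what a weight-space model must respect:
```python
import torch

torch.manual_seed(0)
W1, W2, x = torch.randn(8, 5), torch.randn(3, 8), torch.randn(5)
P = torch.eye(8)[torch.randperm(8)]  # random hidden-unit permutation

y = W2 @ torch.relu(W1 @ x)
y_perm = (W2 @ P.T) @ torch.relu((P @ W1) @ x)
print(torch.allclose(y, y_perm))  # True: (P W1, W2 P^T) computes the same function
```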
arXiv Detail & Related papers (2024-02-07T20:12:27Z)
- Equivariant Adaptation of Large Pretrained Models [20.687626756753563]
We show that a canonicalization network can effectively be used to make a large pretrained network equivariant.
Using dataset-dependent priors to inform the canonicalization function, we are able to make large pretrained models equivariant while maintaining their performance.
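The general recipe, sketched hypothetically below for the four planar rotations (`canon_net` and the hard `argmax` are illustrative only; in practice the pose prediction must be made differentiable and trained so that it is itself equivariant):
```python
import torch

def canonicalized_forward(pretrained, canon_net, x):
    # canon_net predicts which of the 4 rotations each input is in;
    # rotating back puts every input into a canonical pose, so the
    # pretrained network only ever sees one orientation
    k = canon_net(x).argmax(dim=-1)  # (B,) pose index; non-differentiable here
    x_canon = torch.stack([torch.rot90(xi, -int(ki), dims=(-2, -1))
                           for xi, ki in zip(x, k)])
    return pretrained(x_canon)
```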
arXiv Detail & Related papers (2023-10-02T21:21:28Z)
- Deep Neural Networks with Efficient Guaranteed Invariances [77.99182201815763]
We address the problem of improving the performance and in particular the sample complexity of deep neural networks.
Group-equivariant convolutions are a popular approach to obtain equivariant representations.
We propose a multi-stream architecture, where each stream is invariant to a different transformation.
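One way to read the multi-stream construction, with all names hypothetical: each stream is orbit-averaged over its own transformation set, and the resulting invariant features are concatenated:
```python
import torch
import torch.nn as nn

class MultiStream(nn.Module):
    def __init__(self, streams, orbits):
        super().__init__()
        self.streams = nn.ModuleList(streams)
        self.orbits = orbits  # one list of transform callables per stream

    def forward(self, x):
        # averaging net(t(x)) over an orbit makes each stream invariant
        # to its own transformation group
        feats = [torch.stack([net(t(x)) for t in orbit]).mean(0)
                 for net, orbit in zip(self.streams, self.orbits)]
        return torch.cat(feats, dim=-1)
```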
arXiv Detail & Related papers (2023-03-02T20:44:45Z)
- Backpropagation of Unrolled Solvers with Folded Optimization [55.04219793298687]
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks.
One typical strategy is algorithm unrolling, which relies on automatic differentiation through the operations of an iterative solver.
This paper provides theoretical insights into the backward pass of unrolled optimization, leading to a system for generating efficiently solvable analytical models of backpropagation.
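For contrast, naive unrolling looks like the hedged sketch below (ISTA for the LASSO, written as ordinary differentiable operations so autograd produces the backward pass); the paper's contribution is an analytical, more efficient alternative to differentiating through every iteration:
```python
import torch

def soft_threshold(z, t):
    # proximal operator of t * ||.||_1
    return torch.sign(z) * torch.clamp(z.abs() - t, min=0.0)

def unrolled_ista(A, b, lam, n_iters=50):
    # every iterate is a differentiable op, so autograd can
    # backpropagate through the whole solver
    L = torch.linalg.matrix_norm(A, 2) ** 2  # Lipschitz constant of the gradient
    x = torch.zeros(A.shape[1])
    for _ in range(n_iters):
        x = soft_threshold(x - A.T @ (A @ x - b) / L, lam / L)
    return x

A = torch.randn(20, 10, requires_grad=True)
b = torch.randn(20)
unrolled_ista(A, b, lam=0.1).sum().backward()  # d(solution)/dA by unrolling
```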
arXiv Detail & Related papers (2023-01-28T01:50:42Z)
- Relaxing Equivariance Constraints with Non-stationary Continuous Filters [20.74154804898478]
The proposed parameterization can be thought of as a building block that allows adjustable symmetry structure in neural networks.
Compared to non-equivariant or strict-equivariant baselines, we experimentally verify that soft equivariance leads to improved performance in terms of test accuracy on CIFAR-10 and CIFAR-100 image classification tasks.
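A hypothetical sketch of the relaxation pattern, shown for translations rather than the continuous filters treated in the paper: a stationary (translation-equivariant) kernel plus a small position-dependent correction, with `beta = 0` recovering the strict constraint:
```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SoftEquivariantConv(nn.Module):
    def __init__(self, c_in, c_out, k, h, w, beta=0.1):
        super().__init__()
        assert k % 2 == 1
        self.base = nn.Parameter(torch.randn(c_out, c_in * k * k) * 0.1)
        self.corr = nn.Parameter(torch.zeros(h, w, c_out, c_in * k * k))
        self.beta, self.k = beta, k

    def forward(self, x):
        B, C, H, W = x.shape
        patches = F.unfold(x, self.k, padding=self.k // 2)      # (B, C*k*k, H*W)
        patches = patches.transpose(1, 2).reshape(B, H, W, -1)  # (B, H, W, C*k*k)
        # shared stationary kernel + position-dependent deviation
        kernel = self.base + self.beta * self.corr              # (H, W, c_out, C*k*k)
        out = torch.einsum('bhwi,hwoi->bhwo', patches, kernel)
        return out.permute(0, 3, 1, 2)                          # (B, c_out, H, W)
```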
arXiv Detail & Related papers (2022-04-14T18:08:36Z)
- Improving the Sample-Complexity of Deep Classification Networks with Invariant Integration [77.99182201815763]
Leveraging prior knowledge on intraclass variance due to transformations is a powerful method to improve the sample complexity of deep neural networks.
We propose a novel monomial selection algorithm based on pruning methods, allowing application to more complex problems.
We demonstrate the improved sample complexity on the Rotated-MNIST, SVHN and CIFAR-10 datasets.
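The underlying construction is group averaging: a monomial of the input averaged over the group orbit is invariant by definition. A toy sketch for 90-degree rotations (the monomial indices and exponents are hypothetical; the paper's algorithm concerns selecting which monomials to keep):
```python
import torch

def invariant_monomial(x, idx, exps):
    # average one monomial of selected pixels over all four rotations of
    # the image batch (B, C, H, W); the average is invariant under
    # 90-degree rotations because it sums over the whole orbit
    vals = []
    for k in range(4):
        xr = torch.rot90(x, k, dims=(-2, -1)).flatten(1)  # (B, C*H*W)
        vals.append((xr[:, idx] ** exps).prod(dim=-1))    # one scalar per sample
    return torch.stack(vals).mean(0)
```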
arXiv Detail & Related papers (2022-02-08T16:16:11Z)
- Equivariant vector field network for many-body system modeling [65.22203086172019]
The Equivariant Vector Field Network (EVFN) is built on a novel equivariant basis and the associated scalarization and vectorization layers.
We evaluate our method on predicting trajectories of simulated Newtonian mechanics systems with both fully and partially observed data.
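The scalarization/vectorization pattern can be sketched generically (this is the common recipe, not necessarily the paper's exact layers): invariant scalars come from inner products of vector features, and new vectors are mixes of old ones with invariant weights:
```python
import torch

def scalarize(v):
    # v: (N, F, 3) vector features; pairwise inner products are unchanged
    # under any rotation v -> v @ R, so the Gram matrix is invariant
    return torch.einsum('nid,njd->nij', v, v)  # (N, F, F)

def vectorize(v, s):
    # mix vectors with weights that depend only on invariant scalars s,
    # so the output rotates exactly as the input does (equivariance)
    w = torch.softmax(s, dim=-1)               # (N, F, F) invariant weights
    return torch.einsum('nij,njd->nid', w, v)  # (N, F, 3)
```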
arXiv Detail & Related papers (2021-10-26T14:26:25Z)
- A Flexible Framework for Designing Trainable Priors with Adaptive Smoothing and Game Encoding [57.1077544780653]
We introduce a general framework for designing and training neural network layers whose forward passes can be interpreted as solving non-smooth convex optimization problems.
We focus on convex games, solved by local agents represented by the nodes of a graph and interacting through regularization functions.
This approach is appealing for solving imaging problems, as it allows the use of classical image priors within deep models that are trainable end to end.
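The simplest instance of such a layer, as a hedged sketch: a forward pass that solves a small non-smooth convex problem in closed form (the proximal operator of a trainable L1 prior), so the prior weight is learned end to end:
```python
import torch
import torch.nn as nn

class ProxL1Layer(nn.Module):
    """Forward pass solves argmin_z 0.5*||z - x||^2 + lam*||z||_1 in closed
    form (soft-thresholding); the prior weight lam is trained end to end."""
    def __init__(self):
        super().__init__()
        self.log_lam = nn.Parameter(torch.tensor(-2.0))  # keeps lam > 0

    def forward(self, x):
        lam = self.log_lam.exp()
        return torch.sign(x) * torch.clamp(x.abs() - lam, min=0.0)
```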
arXiv Detail & Related papers (2020-06-26T08:34:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.