Equivariant Adaptation of Large Pretrained Models
- URL: http://arxiv.org/abs/2310.01647v2
- Date: Sun, 29 Oct 2023 13:46:45 GMT
- Title: Equivariant Adaptation of Large Pretrained Models
- Authors: Arnab Kumar Mondal, Siba Smarak Panigrahi, Sékou-Oumar Kaba, Sai Rajeswar, Siamak Ravanbakhsh
- Abstract summary: We show that a canonicalization network can effectively be used to make a large pretrained network equivariant.
Using dataset-dependent priors to inform the canonicalization function, we are able to make large pretrained models equivariant while maintaining their performance.
- Score: 20.687626756753563
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Equivariant networks are specifically designed to ensure consistent behavior
with respect to a set of input transformations, leading to higher sample
efficiency and more accurate and robust predictions. However, redesigning each
component of prevalent deep neural network architectures to achieve chosen
equivariance is a difficult problem and can result in a computationally
expensive network during both training and inference. A recently proposed
alternative towards equivariance that removes the architectural constraints is
to use a simple canonicalization network that transforms the input to a
canonical form before feeding it to an unconstrained prediction network. We
show here that this approach can effectively be used to make a large pretrained
network equivariant. However, we observe that the produced canonical
orientations can be misaligned with those of the training distribution,
hindering performance. Using dataset-dependent priors to inform the
canonicalization function, we are able to make large pretrained models
equivariant while maintaining their performance. This significantly improves
the robustness of these models to deterministic transformations of the data,
such as rotations. We believe this equivariant adaptation of large pretrained
models can help their domain-specific applications with known symmetry priors.
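
The approach summarized above lends itself to a compact wrapper: a small canonicalization network picks a canonical pose, that pose is applied to the input, and the frozen pretrained network predicts on the result. The sketch below assumes the discrete C4 rotation group, a hypothetical score-based `canonicalizer`, and a simple cross-entropy prior toward the upright pose; it illustrates the idea and is not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class CanonicalizedClassifier(nn.Module):
    """Wrap a frozen pretrained network with a learned canonicalization step (sketch).

    Assumes the C4 group (rotations by 0/90/180/270 degrees) and a `canonicalizer`
    that outputs one score per image; both are illustrative assumptions.
    """

    def __init__(self, canonicalizer: nn.Module, pretrained: nn.Module, prior_weight: float = 1.0):
        super().__init__()
        self.canonicalizer = canonicalizer   # small network trained from scratch
        self.pretrained = pretrained         # large pretrained prediction network, kept frozen
        self.prior_weight = prior_weight
        for p in self.pretrained.parameters():
            p.requires_grad = False

    def forward(self, x: torch.Tensor):
        # Score every rotated copy of each image; the best-scoring rotation defines
        # the canonical pose. (argmax is not differentiable; an end-to-end trainable
        # version needs an equivariant canonicalizer or a straight-through estimator.)
        rotations = [torch.rot90(x, k, dims=(2, 3)) for k in range(4)]
        scores = torch.stack([self.canonicalizer(r).squeeze(-1) for r in rotations], dim=1)  # (B, 4)
        k_star = scores.argmax(dim=1)
        x_canon = torch.stack(
            [torch.rot90(img, int(k), dims=(1, 2)) for img, k in zip(x, k_star)]
        )
        logits = self.pretrained(x_canon)
        # Dataset-dependent prior: bias canonical poses toward the identity rotation,
        # matching the mostly-upright images the pretrained network was trained on.
        prior_loss = self.prior_weight * F.cross_entropy(scores, torch.zeros_like(k_star))
        return logits, prior_loss
```

Because the canonicalizer scores all four rotated copies, the wrapped model's prediction is unchanged (up to ties) when the input is rotated by any multiple of 90 degrees, which is the robustness the abstract describes; the prior term is what keeps the chosen canonical pose aligned with the training distribution of the pretrained network.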
Related papers
- Improving Equivariant Model Training via Constraint Relaxation [31.507956579770088]
Equivariant neural networks have been widely used in a variety of applications due to their ability to generalize well in tasks where the underlying data symmetries are known.
We propose a novel framework for improving the optimization of such models by relaxing the hard equivariance constraint during training (a sketch of this kind of relaxed-equivariance penalty appears after this list).
We provide experimental results on different state-of-the-art network architectures, demonstrating how this training framework can result in equivariant models with improved generalization performance.
arXiv Detail & Related papers (2024-08-23T17:35:08Z)
- Spatially-varying Regularization with Conditional Transformer for Unsupervised Image Registration [11.498623409184225]
We introduce an end-to-end framework that uses neural networks to learn a deformation regularizer directly from data.
The proposed method is built upon a Transformer-based model, but it can be readily adapted to any network architecture.
arXiv Detail & Related papers (2023-03-10T19:11:16Z)
- Self-Supervised Learning for Group Equivariant Neural Networks [75.62232699377877]
Group equivariant neural networks are models whose structure is constrained to commute with transformations of the input.
We propose two concepts for self-supervised tasks: equivariant pretext labels and invariant contrastive loss.
Experiments on standard image recognition benchmarks demonstrate that the equivariant neural networks exploit the proposed self-supervised tasks.
arXiv Detail & Related papers (2023-03-08T08:11:26Z)
- Deep Neural Networks with Efficient Guaranteed Invariances [77.99182201815763]
We address the problem of improving the performance and in particular the sample complexity of deep neural networks.
Group-equivariant convolutions are a popular approach to obtain equivariant representations.
We propose a multi-stream architecture, where each stream is invariant to a different transformation.
arXiv Detail & Related papers (2023-03-02T20:44:45Z)
- Faithful Heteroscedastic Regression with Neural Networks [2.2835610890984164]
Parametric methods that employ neural networks for parameter maps can capture complex relationships in the data.
We make two simple modifications to optimization to produce a heteroscedastic model with mean estimates that are provably as accurate as those from its homoscedastic counterpart.
Our approach provably retains the accuracy of an equally flexible mean-only model while also offering best-in-class variance calibration.
arXiv Detail & Related papers (2022-12-18T22:34:42Z)
- The Lie Derivative for Measuring Learned Equivariance [84.29366874540217]
We study the equivariance properties of hundreds of pretrained models, spanning CNNs, transformers, and Mixer architectures.
We find that many violations of equivariance can be linked to spatial aliasing in ubiquitous network layers, such as pointwise non-linearities.
For example, transformers can be more equivariant than convolutional neural networks after training.
arXiv Detail & Related papers (2022-10-06T15:20:55Z)
- Relaxing Equivariance Constraints with Non-stationary Continuous Filters [20.74154804898478]
The proposed parameterization can be thought of as a building block to allow adjustable symmetry structure in neural networks.
Compared to non-equivariant or strict-equivariant baselines, we experimentally verify that soft equivariance leads to improved performance in terms of test accuracy on CIFAR-10 and CIFAR-100 image classification tasks.
arXiv Detail & Related papers (2022-04-14T18:08:36Z)
- Improving the Sample-Complexity of Deep Classification Networks with Invariant Integration [77.99182201815763]
Leveraging prior knowledge on intraclass variance due to transformations is a powerful method to improve the sample complexity of deep neural networks.
We propose a novel monomial selection algorithm based on pruning methods to allow an application to more complex problems.
We demonstrate the improved sample complexity on the Rotated-MNIST, SVHN and CIFAR-10 datasets.
arXiv Detail & Related papers (2022-02-08T16:16:11Z)
- Revisiting Transformation Invariant Geometric Deep Learning: Are Initial Representations All You Need? [80.86819657126041]
We show that transformation-invariant and distance-preserving initial representations are sufficient to achieve transformation invariance.
Specifically, we realize transformation-invariant and distance-preserving initial point representations by modifying multi-dimensional scaling.
We prove that the proposed TinvNN strictly guarantees transformation invariance and is general and flexible enough to be combined with existing neural networks.
arXiv Detail & Related papers (2021-12-23T03:52:33Z)
- Training or Architecture? How to Incorporate Invariance in Neural Networks [14.162739081163444]
We propose a method for constructing network architectures that are provably invariant with respect to group actions.
In a nutshell, we intend to 'undo' any possible transformation before feeding the data into the actual network.
We analyze properties of such approaches, extend them to equivariant networks, and demonstrate their advantages in terms of robustness as well as computational efficiency in several numerical examples.
arXiv Detail & Related papers (2021-06-18T10:31:00Z)
- Learning Invariances in Neural Networks [51.20867785006147]
We show how to parameterize a distribution over augmentations and optimize the training loss simultaneously with respect to the network parameters and augmentation parameters.
We can recover the correct set and extent of invariances on image classification, regression, segmentation, and molecular property prediction from a large space of augmentations.
arXiv Detail & Related papers (2020-10-22T17:18:48Z)
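
Two of the entries above ("Improving Equivariant Model Training via Constraint Relaxation" and "Relaxing Equivariance Constraints with Non-stationary Continuous Filters") soften the hard equivariance requirement during training. A common way to express such a relaxation, used here purely as an illustration and not taken from either paper, is to add a penalty on the equivariance violation to the task loss; the sketch below assumes a model that outputs spatial feature maps and the C4 rotation group.

```python
import torch
import torch.nn.functional as F


def equivariance_penalty(model, x):
    """Mean squared violation of rotation equivariance over the non-trivial C4 rotations.

    Assumes `model` maps image batches (B, C, H, W) to spatial feature maps with the
    same spatial layout, so that rotating the output is meaningful.
    """
    y = model(x)
    penalty = x.new_zeros(())
    for k in (1, 2, 3):
        x_rot = torch.rot90(x, k, dims=(2, 3))        # rotate the input by k * 90 degrees
        y_rot = model(x_rot)                          # features of the rotated input
        y_expected = torch.rot90(y, k, dims=(2, 3))   # what exact equivariance would give
        penalty = penalty + F.mse_loss(y_rot, y_expected)
    return penalty / 3.0


# Training step (sketch): task loss plus a weighted penalty, with the weight `lam`
# controlling how strictly equivariance is enforced.
# loss = F.cross_entropy(head(model(x)), labels) + lam * equivariance_penalty(model, x)
```

A large penalty weight approaches strict equivariance, while a small weight lets the network trade equivariance for task accuracy.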