Data Augmentation vs. Equivariant Networks: A Theory of Generalization on Dynamics Forecasting
- URL: http://arxiv.org/abs/2206.09450v1
- Date: Sun, 19 Jun 2022 17:00:12 GMT
- Title: Data Augmentation vs. Equivariant Networks: A Theory of Generalization on Dynamics Forecasting
- Authors: Rui Wang, Robin Walters, Rose Yu
- Abstract summary: Exploiting symmetry in dynamical systems is a powerful way to improve the generalization of deep learning.
Data augmentation and equivariant networks are two major approaches to injecting symmetry into learning.
We derive the generalization bounds for data augmentation and equivariant networks, characterizing their effect on learning in a unified framework.
- Score: 24.363954435050264
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Exploiting symmetry in dynamical systems is a powerful way to improve the
generalization of deep learning. The model learns to be invariant to symmetry
transformations and hence is more robust to distribution shift. Data
augmentation and equivariant networks are two major approaches to injecting
symmetry into learning. However, their exact role in improving generalization
is not well understood. In this work, we derive the generalization bounds for
data augmentation and equivariant networks, characterizing their effect on
learning in a unified framework. Unlike most prior theories for the i.i.d.
setting, we focus on non-stationary dynamics forecasting with complex temporal
dependencies.
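
As a rough, hedged illustration of the two approaches the abstract contrasts (this is not the authors' code; the C4 rotation group, the toy CNN, the grid-shaped snapshots, and the helper names `rot`, `augmented_loss`, and `equivariant_forecast` are all assumptions made for the example), the sketch below injects a 90-degree rotation symmetry into a one-step forecaster mapping x_t to x_{t+1}, either by augmenting training pairs with group elements or by group-averaging the network so it is exactly equivariant by construction:

```python
# Minimal sketch: two ways to inject C4 (90-degree rotation) symmetry into a
# dynamics forecaster on fields of shape (batch, channels, H, W).
import torch
import torch.nn as nn

def rot(x, k):
    """Rotate a (B, C, H, W) field by k * 90 degrees in the spatial plane."""
    return torch.rot90(x, k, dims=(2, 3))

base = nn.Sequential(                       # any forecaster; a toy CNN here
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, 3, padding=1),
)

# --- Approach 1: data augmentation ------------------------------------
# Sample a group element per batch and transform input AND target, so the
# network is trained to (approximately) commute with the symmetry.
def augmented_loss(x, y):
    k = int(torch.randint(0, 4, ()))
    pred = base(rot(x, k))
    return ((pred - rot(y, k)) ** 2).mean()

# --- Approach 2: an equivariant model by group averaging --------------
# f_equiv(x) = 1/|G| * sum_g g^{-1} f(g x) is exactly C4-equivariant for any
# base network f (a Reynolds-operator construction; practical equivariant
# networks such as steerable CNNs build the constraint into every layer
# instead of averaging the whole model).
def equivariant_forecast(x):
    outs = [rot(base(rot(x, k)), -k) for k in range(4)]
    return torch.stack(outs).mean(dim=0)

x = torch.randn(8, 1, 32, 32)               # toy batch of vorticity-like fields
y = torch.randn(8, 1, 32, 32)
print(augmented_loss(x, y).item())
print(equivariant_forecast(x).shape)        # torch.Size([8, 1, 32, 32])
```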
Related papers
- Learning Divergence Fields for Shift-Robust Graph Representations [73.11818515795761]
In this work, we propose a geometric diffusion model with learnable divergence fields for the challenging problem with interdependent data.
We derive a new learning objective through causal inference, which can guide the model to learn generalizable patterns of interdependence that are insensitive across domains.
arXiv Detail & Related papers (2024-06-07T14:29:21Z)
- Equivariant Adaptation of Large Pretrained Models [20.687626756753563]
We show that a canonicalization network can effectively be used to make a large pretrained network equivariant.
Using dataset-dependent priors to inform the canonicalization function, we are able to make large pretrained models equivariant while maintaining their performance.
arXiv Detail & Related papers (2023-10-02T21:21:28Z)
- DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained Diffusion [66.21290235237808]
We introduce an energy constrained diffusion model which encodes a batch of instances from a dataset into evolutionary states.
We provide rigorous theory that implies closed-form optimal estimates for the pairwise diffusion strength among arbitrary instance pairs.
Experiments highlight the wide applicability of our model as a general-purpose encoder backbone with superior performance in various tasks.
arXiv Detail & Related papers (2023-01-23T15:18:54Z)
- The good, the bad and the ugly sides of data augmentation: An implicit spectral regularization perspective [14.229855423083922]
Data augmentation (DA) is a powerful workhorse for bolstering performance in modern machine learning.
In this work, we develop a new theoretical framework to characterize the impact of a general class of DA on generalization.
Our framework highlights the nuanced and sometimes surprising impacts of DA on generalization, and serves as a testbed for novel augmentation design.
arXiv Detail & Related papers (2022-10-10T21:30:46Z)
- Regularising for invariance to data augmentation improves supervised learning [82.85692486314949]
We show that using multiple augmentations per input can improve generalisation.
We propose an explicit regulariser that encourages this invariance on the level of individual model predictions (a hedged sketch of this idea appears after this list).
arXiv Detail & Related papers (2022-03-07T11:25:45Z)
- Improving the Sample-Complexity of Deep Classification Networks with Invariant Integration [77.99182201815763]
Leveraging prior knowledge on intraclass variance due to transformations is a powerful method to improve the sample complexity of deep neural networks.
We propose a novel monomial selection algorithm based on pruning methods to allow an application to more complex problems.
We demonstrate the improved sample complexity on the Rotated-MNIST, SVHN and CIFAR-10 datasets.
arXiv Detail & Related papers (2022-02-08T16:16:11Z)
- Deep invariant networks with differentiable augmentation layers [87.22033101185201]
Methods for learning data augmentation policies require held-out data and are based on bilevel optimization problems.
We show that our approach is easier and faster to train than modern automatic data augmentation techniques.
arXiv Detail & Related papers (2022-02-04T14:12:31Z)
- More Is More -- Narrowing the Generalization Gap by Adding Classification Heads [8.883733362171032]
We introduce an architecture enhancement for existing neural network models based on input transformations, termed 'TransNet'.
Our model can be employed during training time only and then pruned for prediction, resulting in an equivalent architecture to the base model.
arXiv Detail & Related papers (2021-02-09T16:30:33Z)
- Learning Invariances in Neural Networks [51.20867785006147]
We show how to parameterize a distribution over augmentations and optimize the training loss simultaneously with respect to the network parameters and augmentation parameters.
We can recover the correct set and extent of invariances on image classification, regression, segmentation, and molecular property prediction from a large space of augmentations.
arXiv Detail & Related papers (2020-10-22T17:18:48Z)
- Incorporating Symmetry into Deep Dynamics Models for Improved Generalization [24.363954435050264]
We propose to improve accuracy and generalization by incorporating symmetries into convolutional neural networks.
Our models are theoretically and experimentally robust to distributional shift by symmetry group transformations.
Compared with image or text applications, our work is a significant step towards applying equivariant neural networks to high-dimensional systems.
arXiv Detail & Related papers (2020-02-08T01:28:17Z)