Equivariance Discovery by Learned Parameter-Sharing
- URL: http://arxiv.org/abs/2204.03640v1
- Date: Thu, 7 Apr 2022 17:59:19 GMT
- Title: Equivariance Discovery by Learned Parameter-Sharing
- Authors: Raymond A. Yeh, Yuan-Ting Hu, Mark Hasegawa-Johnson, Alexander G. Schwing
- Abstract summary: We study how to discover interpretable equivariances from data.
Specifically, we formulate this discovery process as an optimization problem over a model's parameter-sharing schemes.
Also, we theoretically analyze the method for Gaussian data and provide a bound on the mean squared gap between the studied discovery scheme and the oracle scheme.
- Score: 153.41877129746223
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Designing equivariance as an inductive bias into deep-nets has been a
prominent approach to build effective models, e.g., a convolutional neural
network incorporates translation equivariance. However, incorporating these
inductive biases requires knowledge about the equivariance properties of the
data, which may not be available, e.g., when encountering a new domain. To
address this, we study how to discover interpretable equivariances from data.
Specifically, we formulate this discovery process as an optimization problem
over a model's parameter-sharing schemes. We propose to use the partition
distance to empirically quantify the accuracy of the recovered equivariance.
Also, we theoretically analyze the method for Gaussian data and provide a bound
on the mean squared gap between the studied discovery scheme and the oracle
scheme. Empirically, we show that the approach recovers known equivariances,
such as permutations and shifts, on sum of numbers and spatially-invariant
data.
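The abstract mentions two concrete ingredients that lend themselves to a small illustration: an optimization over a model's parameter-sharing scheme, and the partition distance used to score how well a recovered scheme matches the oracle scheme. The sketch below is not the authors' implementation; `SharedLinear`, `partition_distance`, and the soft-assignment parameterization are hypothetical stand-ins, and PyTorch/SciPy are assumed.

```python
# Minimal sketch (hypothetical, not the authors' code) of (1) a soft
# parameter-sharing scheme optimized jointly with the weights and (2) the
# partition distance used to compare a recovered sharing scheme to an oracle.
import numpy as np
import torch
import torch.nn as nn
from scipy.optimize import linear_sum_assignment


class SharedLinear(nn.Module):
    """Toy linear map whose d weights are tied through a learned assignment.

    Each of the d positions selects one of k shared parameters via a softmax
    over logits; training the logits is a relaxed search over sharing schemes.
    """

    def __init__(self, d: int, k: int):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(d, k))    # sharing scheme
        self.free_params = nn.Parameter(torch.randn(k))  # shared weights

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        assign = torch.softmax(self.logits, dim=-1)      # (d, k) soft assignment
        w = assign @ self.free_params                    # (d,) effective weights
        return x @ w                                     # x: (batch, d) -> (batch,)

    def recovered_partition(self) -> np.ndarray:
        # Hard-assign each position to its most likely shared parameter.
        return self.logits.argmax(dim=-1).cpu().numpy()


def partition_distance(labels_a: np.ndarray, labels_b: np.ndarray) -> int:
    """Minimum number of elements to reassign so two partitions coincide.

    Computed as n minus the maximum total block overlap under an optimal
    one-to-one matching of blocks (a linear assignment problem).
    """
    _, a = np.unique(labels_a, return_inverse=True)
    _, b = np.unique(labels_b, return_inverse=True)
    overlap = np.zeros((a.max() + 1, b.max() + 1), dtype=int)
    for i, j in zip(a, b):
        overlap[i, j] += 1
    rows, cols = linear_sum_assignment(-overlap)  # maximize matched overlap
    return int(len(labels_a) - overlap[rows, cols].sum())
```

In this toy setup, a partition distance of 0 against the oracle labeling means the sharing scheme was recovered exactly; the paper reports this metric for permutation- and shift-equivariant ground truths, though its exact parameterization and training procedure may differ from this sketch.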
Related papers
- Collaborative Heterogeneous Causal Inference Beyond Meta-analysis [68.4474531911361]
We propose a collaborative inverse propensity score estimator for causal inference with heterogeneous data.
Our method shows significant improvements over the methods based on meta-analysis when heterogeneity increases.
arXiv Detail & Related papers (2024-04-24T09:04:36Z)
- What Affects Learned Equivariance in Deep Image Recognition Models? [10.590129221143222]
We find evidence for a correlation between learned translation equivariance and validation accuracy on ImageNet.
Data augmentation, reduced model capacity and inductive bias in the form of convolutions induce higher learned equivariance in neural networks.
arXiv Detail & Related papers (2023-04-05T17:54:25Z)
- The Lie Derivative for Measuring Learned Equivariance [84.29366874540217]
We study the equivariance properties of hundreds of pretrained models, spanning CNNs, transformers, and Mixer architectures.
We find that many violations of equivariance can be linked to spatial aliasing in ubiquitous network layers, such as pointwise non-linearities.
For example, transformers can be more equivariant than convolutional neural networks after training.
arXiv Detail & Related papers (2022-10-06T15:20:55Z)
- Equivariant Disentangled Transformation for Domain Generalization under Combination Shift [91.38796390449504]
Combinations of domains and labels are not observed during training but appear in the test environment.
We provide a unique formulation of the combination shift problem based on the concepts of homomorphism, equivariance, and a refined definition of disentanglement.
arXiv Detail & Related papers (2022-08-03T12:31:31Z)
- On the Strong Correlation Between Model Invariance and Generalization [54.812786542023325]
Generalization captures a model's ability to classify unseen data.
Invariance measures consistency of model predictions on transformations of the data.
From a dataset-centric view, we find that a given model's accuracy and invariance are linearly correlated across different test sets.
arXiv Detail & Related papers (2022-07-14T17:08:25Z)
- Group equivariant neural posterior estimation [9.80649677905172]
Group equivariant neural posterior estimation (GNPE) is based on self-consistently standardizing the "pose" of the data.
We show GNPE achieves state-of-the-art accuracy while reducing inference times by three orders of magnitude.
arXiv Detail & Related papers (2021-11-25T15:50:01Z)
- Learning Invariances in Neural Networks [51.20867785006147]
We show how to parameterize a distribution over augmentations and optimize the training loss simultaneously with respect to the network parameters and augmentation parameters.
We can recover the correct set and extent of invariances on image classification, regression, segmentation, and molecular property prediction from a large space of augmentations (a brief sketch of this jointly-optimized augmentation distribution appears after this list).
arXiv Detail & Related papers (2020-10-22T17:18:48Z)
- What causes the test error? Going beyond bias-variance via ANOVA [21.359033212191218]
Modern machine learning methods are often overparametrized, allowing adaptation to the data at a fine level.
Recent work aimed to understand in greater depth why overparametrization is helpful for generalization.
We propose using the analysis of variance (ANOVA) to decompose the variance in the test error in a symmetric way.
arXiv Detail & Related papers (2020-10-11T05:21:13Z)
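As a companion to the "Learning Invariances in Neural Networks" entry above, here is a minimal sketch of the idea it describes: a learnable distribution over augmentations (here, uniform 2D rotations with a trainable half-range) optimized jointly with the network. The names (`AugmentedModel`, `theta_max`, `range_penalty`) and the rotation example are hypothetical illustrations, not that paper's implementation.

```python
# Hypothetical sketch of a learnable augmentation distribution trained jointly
# with the network parameters (inspired by the entry above, not its code).
import torch
import torch.nn as nn


class AugmentedModel(nn.Module):
    """Wraps a base network with a learnable distribution over 2D rotations."""

    def __init__(self, base: nn.Module, n_samples: int = 4):
        super().__init__()
        self.base = base
        self.n_samples = n_samples
        # Half-range of a uniform distribution over rotation angles (radians),
        # learned jointly with the network parameters.
        self.theta_max = nn.Parameter(torch.tensor(0.1))

    def rotate(self, x: torch.Tensor, theta: torch.Tensor) -> torch.Tensor:
        # x: (batch, 2) points; theta: (batch,) per-example angles.
        c, s = torch.cos(theta), torch.sin(theta)
        rot = torch.stack([torch.stack([c, -s], dim=-1),
                           torch.stack([s, c], dim=-1)], dim=-2)  # (batch, 2, 2)
        return torch.einsum('bij,bj->bi', rot, x)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Average predictions over sampled augmentations; the reparameterized
        # sampling lets gradients reach theta_max.
        outs = []
        for _ in range(self.n_samples):
            u = 2 * torch.rand(x.shape[0], device=x.device) - 1  # U(-1, 1)
            outs.append(self.base(self.rotate(x, u * self.theta_max)))
        return torch.stack(outs).mean(dim=0)

    def range_penalty(self) -> torch.Tensor:
        # Minus the range: adding this to the loss rewards wider ranges, so
        # invariance is only given up when the data demands it.
        return -self.theta_max.abs()
```

A typical training step would add a small weight on this penalty to keep the learned range as wide as the data allows, e.g. `loss = criterion(model(x), y) + 0.01 * model.range_penalty()`.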
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.