Provably Strict Generalisation Benefit for Equivariant Models
- URL: http://arxiv.org/abs/2102.10333v1
- Date: Sat, 20 Feb 2021 12:47:32 GMT
- Title: Provably Strict Generalisation Benefit for Equivariant Models
- Authors: Bryn Elesedy and Sheheryar Zaidi
- Abstract summary: It is widely believed that engineering a model to be invariant/equivariant improves generalisation.
This paper provides the first provably non-zero improvement in generalisation for invariant/equivariant models.
- Score: 1.332560004325655
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: It is widely believed that engineering a model to be invariant/equivariant
improves generalisation. Despite the growing popularity of this approach, a
precise characterisation of the generalisation benefit is lacking. By
considering the simplest case of linear models, this paper provides the first
provably non-zero improvement in generalisation for invariant/equivariant
models when the target distribution is invariant/equivariant with respect to a
compact group. Moreover, our work reveals an interesting relationship between
generalisation, the number of training examples and properties of the group
action. Our results rest on an observation of the structure of function spaces
under averaging operators which, along with its consequences for feature
averaging, may be of independent interest.
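The abstract's key ingredients, averaging operators and feature averaging, can be made concrete with a small sketch. Below is a minimal NumPy illustration, assuming a finite cyclic group acting by coordinate shifts in place of the paper's general compact group (where the sum over group elements would be an integral against the Haar measure); the variable names and the specific group action are illustrative, not the paper's construction.

```python
import numpy as np

# Illustration (not the paper's exact setting): the averaging operator
# for a finite group G acting on R^d by permutation matrices,
#   (O w)(x) = (1/|G|) * sum_g w(g x),
# specialised to linear models f_w(x) = <w, x>.

d = 6
# Cyclic group C_d acting on R^d by coordinate shifts.
group = [np.roll(np.eye(d), k, axis=0) for k in range(d)]

def average_features(x):
    """Feature averaging: average x over its orbit under G."""
    return sum(g @ x for g in group) / len(group)

rng = np.random.default_rng(0)
x = rng.normal(size=d)
w = rng.normal(size=d)

x_bar = average_features(x)

# Idempotence: averaging an already averaged vector changes nothing,
# so the averaging operator is a projection onto the invariant subspace.
assert np.allclose(average_features(x_bar), x_bar)

# Self-adjointness (permutation matrices satisfy g^T = g^{-1}, and the
# group is closed under inverses): averaging the features is the same
# as averaging the weights,  <w, O x> == <O w, x>.
assert np.isclose(w @ x_bar, average_features(w) @ x)

# The averaged model is exactly invariant: its prediction is unchanged
# when the input is transformed by any group element.
for g in group:
    assert np.isclose(average_features(w) @ (g @ x),
                      average_features(w) @ x)
```

The projection and self-adjointness checks mirror, in this toy finite-group setting, the kind of structure of function spaces under averaging operators, and its consequences for feature averaging, that the abstract refers to.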
Related papers
- Scaling and renormalization in high-dimensional regression [72.59731158970894]
This paper presents a succinct derivation of the training and generalization performance of a variety of high-dimensional ridge regression models.
We provide an introduction and review of recent results on these topics, aimed at readers with backgrounds in physics and deep learning.
arXiv Detail & Related papers (2024-05-01T15:59:00Z) - Domain Generalization In Robust Invariant Representation [10.132611239890345]
In this paper, we investigate the generalization of invariant representations on out-of-distribution data.
We show that the invariant model learns unstructured latent representations that are robust to distribution shifts.
arXiv Detail & Related papers (2023-04-07T00:58:30Z) - Generalized Invariant Matching Property via LASSO [19.786769414376323]
In this work, we generalize the invariant matching property by formulating a high-dimensional problem with intrinsic sparsity.
We propose a more robust and computationally efficient algorithm by leveraging a variant of the Lasso.
arXiv Detail & Related papers (2023-01-14T21:09:30Z) - Instance-Dependent Generalization Bounds via Optimal Transport [51.71650746285469]
Existing generalization bounds fail to explain crucial factors that drive the generalization of modern neural networks.
We derive instance-dependent generalization bounds that depend on the local Lipschitz regularity of the learned prediction function in the data space.
We empirically analyze our generalization bounds for neural networks, showing that the bound values are meaningful and capture the effect of popular regularization methods during training.
arXiv Detail & Related papers (2022-11-02T16:39:42Z) - A PAC-Bayesian Generalization Bound for Equivariant Networks [15.27608414735815]
We derive norm-based PAC-Bayesian generalization bounds for equivariant networks.
The bound characterizes the impact of the group size, and of the multiplicity and degree of the irreducible representations, on the generalization error.
In general, the bound indicates that using a larger group size in the model improves the generalization error, a finding substantiated by extensive numerical experiments.
arXiv Detail & Related papers (2022-10-24T12:07:03Z) - Predicting Out-of-Domain Generalization with Neighborhood Invariance [59.05399533508682]
We propose a measure of a classifier's output invariance in a local transformation neighborhood.
Our measure is simple to calculate, does not depend on the test point's true label, and can be applied even in out-of-domain (OOD) settings.
In experiments on benchmarks in image classification, sentiment analysis, and natural language inference, we demonstrate a strong and robust correlation between our measure and actual OOD generalization. (A minimal sketch of such a measure appears after this list.)
arXiv Detail & Related papers (2022-07-05T14:55:16Z) - An Invariant Matching Property for Distribution Generalization under Intervened Response [19.786769414376323]
We show a novel form of invariance by incorporating the estimates of certain features as additional predictors.
We provide an explicit characterization of the linear matching and present our simulation results under various intervention settings.
arXiv Detail & Related papers (2022-05-18T18:25:21Z) - LieTransformer: Equivariant self-attention for Lie Groups [49.9625160479096]
Group equivariant neural networks are used as building blocks of group invariant neural networks.
We extend the scope of the literature to self-attention, which is emerging as a prominent building block of deep learning models.
We propose the LieTransformer, an architecture composed of LieSelfAttention layers that are equivariant to arbitrary Lie groups and their discrete subgroups.
arXiv Detail & Related papers (2020-12-20T11:02:49Z) - Estimating Generalization under Distribution Shifts via Domain-Invariant Representations [75.74928159249225]
We use a set of domain-invariant predictors as a proxy for the unknown, true target labels.
The error of the resulting risk estimate depends on the target risk of the proxy model.
arXiv Detail & Related papers (2020-07-06T17:21:24Z) - Convex Representation Learning for Generalized Invariance in Semi-Inner-Product Space [32.442549424823355]
In this work, we develop an algorithm for a variety of generalized invariant representations modeled via semi-norms, for which representer theorems and bounds are established.
This allows the representations to be learned efficiently and effectively, as confirmed by the accurate predictions in our experiments.
arXiv Detail & Related papers (2020-04-25T18:54:37Z) - Invariant Feature Coding using Tensor Product Representation [75.62232699377877]
We prove that the group-invariant feature vector contains sufficient discriminative information when learning a linear classifier.
A novel feature model that explicitly considers group actions is proposed for principal component analysis and k-means clustering.
arXiv Detail & Related papers (2019-06-05T07:15:17Z)
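As referenced in the "Predicting Out-of-Domain Generalization with Neighborhood Invariance" entry above, a label-free invariance score can be computed by checking how stable a classifier's prediction is under local transformations. The following is a minimal sketch of that kind of measure; the function names, the choice of prediction agreement as the statistic, and the toy transformation set are assumptions for illustration, not that paper's exact definition.

```python
import numpy as np

def neighborhood_invariance(predict, x, transforms, rng, n_samples=16):
    """Fraction of randomly transformed neighbors of x on which the
    predicted class agrees with the prediction on x itself.
    Needs no true label, so it can be applied to OOD test points."""
    base = predict(x)
    agree = 0
    for _ in range(n_samples):
        t = transforms[rng.integers(len(transforms))]
        agree += int(predict(t(x)) == base)
    return agree / n_samples

# Toy usage: a linear classifier on R^2 scored under small random shifts.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 2))                      # 3 classes, 2 features
predict = lambda x: int(np.argmax(W @ x))
transforms = [lambda x, s=s: x + s               # small additive shifts
              for s in rng.normal(scale=0.05, size=(8, 2))]

x = rng.normal(size=2)
score = neighborhood_invariance(predict, x, transforms, rng)
print(f"neighborhood invariance of x: {score:.2f}")  # 1.0 = fully stable
```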
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the generated summaries (including all information) and is not responsible for any consequences of their use.