Symmetry and Generalisation in Machine Learning
- URL: http://arxiv.org/abs/2501.03858v1
- Date: Tue, 07 Jan 2025 15:14:58 GMT
- Title: Symmetry and Generalisation in Machine Learning
- Authors: Hayder Elesedy,
- Abstract summary: We show that for any predictor that is not equivariant, there is an equivariant predictor with strictly lower test risk on all regression problems.
We adopt an alternative perspective and formalise the common intuition that learning with invariant models reduces to a problem in terms of orbit representatives.
- Score: 0.0
- License:
- Abstract: This work is about understanding the impact of invariance and equivariance on generalisation in supervised learning. We use the perspective afforded by an averaging operator to show that for any predictor that is not equivariant, there is an equivariant predictor with strictly lower test risk on all regression problems where the equivariance is correctly specified. This constitutes a rigorous proof that symmetry, in the form of invariance or equivariance, is a useful inductive bias. We apply these ideas to equivariance and invariance in random design least squares and kernel ridge regression respectively. This allows us to specify the reduction in expected test risk in more concrete settings and express it in terms of properties of the group, the model and the data. Along the way, we give examples and additional results to demonstrate the utility of the averaging operator approach in analysing equivariant predictors. In addition, we adopt an alternative perspective and formalise the common intuition that learning with invariant models reduces to a problem in terms of orbit representatives. The formalism extends naturally to a similar intuition for equivariant models. We conclude by connecting the two perspectives and giving some ideas for future work.
Related papers
- Equivariant score-based generative models provably learn distributions with symmetries efficiently [7.90752151686317]
Empirical studies have demonstrated that incorporating symmetries into generative models can provide better generalization and sampling efficiency.
We provide the first theoretical analysis and guarantees of score-based generative models (SGMs) for learning distributions that are invariant with respect to some group symmetry.
arXiv Detail & Related papers (2024-10-02T05:14:28Z) - Almost Equivariance via Lie Algebra Convolutions [0.0]
We provide a definition of almost equivariance and give a practical method for encoding it in models.
Specifically, we define Lie algebra convolutions and demonstrate that they offer several benefits over Lie group convolutions.
We prove two existence theorems, one showing the existence of almost isometries within bounded distance of isometries of a manifold, and another showing the converse for Hilbert spaces.
arXiv Detail & Related papers (2023-10-19T21:31:11Z) - Approximation-Generalization Trade-offs under (Approximate) Group
Equivariance [3.0458514384586395]
Group equivariant neural networks have demonstrated impressive performance across various domains and applications such as protein and drug design.
We show how models capturing task-specific symmetries lead to improved generalization.
We examine the more general question of model mis-specification when the model symmetries don't align with the data symmetries.
arXiv Detail & Related papers (2023-05-27T22:53:37Z) - Evaluating the Robustness of Interpretability Methods through
Explanation Invariance and Equivariance [72.50214227616728]
Interpretability methods are valuable only if their explanations faithfully describe the explained model.
We consider neural networks whose predictions are invariant under a specific symmetry group.
arXiv Detail & Related papers (2023-04-13T17:59:03Z) - Equivariant Disentangled Transformation for Domain Generalization under
Combination Shift [91.38796390449504]
Combinations of domains and labels are not observed during training but appear in the test environment.
We provide a unique formulation of the combination shift problem based on the concepts of homomorphism, equivariance, and a refined definition of disentanglement.
arXiv Detail & Related papers (2022-08-03T12:31:31Z) - On the Strong Correlation Between Model Invariance and Generalization [54.812786542023325]
Generalization captures a model's ability to classify unseen data.
Invariance measures consistency of model predictions on transformations of the data.
From a dataset-centric view, we find a certain model's accuracy and invariance linearly correlated on different test sets.
arXiv Detail & Related papers (2022-07-14T17:08:25Z) - Equivariance Discovery by Learned Parameter-Sharing [153.41877129746223]
We study how to discover interpretable equivariances from data.
Specifically, we formulate this discovery process as an optimization problem over a model's parameter-sharing schemes.
Also, we theoretically analyze the method for Gaussian data and provide a bound on the mean squared gap between the studied discovery scheme and the oracle scheme.
arXiv Detail & Related papers (2022-04-07T17:59:19Z) - Counterfactual Invariance to Spurious Correlations: Why and How to Pass
Stress Tests [87.60900567941428]
A spurious correlation' is the dependence of a model on some aspect of the input data that an analyst thinks shouldn't matter.
In machine learning, these have a know-it-when-you-see-it character.
We study stress testing using the tools of causal inference.
arXiv Detail & Related papers (2021-05-31T14:39:38Z) - Learning Invariances in Neural Networks [51.20867785006147]
We show how to parameterize a distribution over augmentations and optimize the training loss simultaneously with respect to the network parameters and augmentation parameters.
We can recover the correct set and extent of invariances on image classification, regression, segmentation, and molecular property prediction from a large space of augmentations.
arXiv Detail & Related papers (2020-10-22T17:18:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.