On the Implicit Bias of Linear Equivariant Steerable Networks
- URL: http://arxiv.org/abs/2303.04198v2
- Date: Fri, 5 May 2023 04:10:21 GMT
- Title: On the Implicit Bias of Linear Equivariant Steerable Networks
- Authors: Ziyu Chen, Wei Zhu
- Abstract summary: We study the implicit bias of gradient flow on linear equivariant steerable networks in group-invariant binary classification.
Under a unitary assumption on the input representation, we establish the equivalence between steerable networks and data augmentation.
- Score: 9.539074889921935
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the implicit bias of gradient flow on linear equivariant steerable
networks in group-invariant binary classification. Our findings reveal that the
parameterized predictor converges in direction to the unique group-invariant
classifier with a maximum margin defined by the input group action. Under a
unitary assumption on the input representation, we establish the equivalence
between steerable networks and data augmentation. Furthermore, we demonstrate
the improved margin and generalization bound of steerable networks over their
non-invariant counterparts.
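As an illustration of the equivalence claim, the following minimal NumPy sketch (not taken from the paper; the flip group, toy data, and all names are illustrative) trains a single linear predictor in two ways on a group-invariant binary classification task: once with its weights constrained to the group-invariant subspace ("steerable"), and once unconstrained but on the group-augmented dataset. For this orthogonal (hence unitary) representation, the two runs settle on the same direction.
```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 40, 6
# group G = {identity, coordinate reversal}: an orthogonal (hence unitary) representation
F = np.eye(d)[::-1]                       # flip permutation matrix
P = (np.eye(d) + F) / 2                   # orthogonal projection onto the G-invariant subspace

X = rng.normal(size=(n, d))
w_star = np.array([1.0, -2.0, 0.5, 0.5, -2.0, 1.0])   # a flip-invariant direction
y = np.sign(X @ w_star)                   # labels separable by an invariant classifier

X_aug = np.concatenate([X, X @ F.T])      # data augmentation: add the flipped copies
y_aug = np.concatenate([y, y])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gd(A, b, constrain=False, steps=20000, lr=1.0):
    """Gradient descent on the logistic loss of a linear predictor."""
    w = np.zeros(A.shape[1])
    for _ in range(steps):
        grad = -(b * sigmoid(-b * (A @ w))) @ A / len(b)
        if constrain:
            grad = P @ grad               # "steerable": weights stay in the invariant subspace
        w -= lr * grad
    return w / np.linalg.norm(w)

w_steer = gd(X, y, constrain=True)        # invariant linear predictor on the raw data
w_aug = gd(X_aug, y_aug)                  # unconstrained linear predictor on augmented data
print(w_steer @ w_aug)                    # close to 1: the two limiting directions agree
```
With zero initialization the augmented run in fact never leaves the invariant subspace, so here the two trajectories coincide exactly; the paper's result concerns the common limiting direction of gradient flow more generally.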
Related papers
- Interpreting Equivariant Representations [5.325297567945828]
In this paper, we demonstrate that the inductive bias imposed by an equivariant model must also be taken into account when using its latent representations.
We show that failing to account for these inductive biases decreases performance on downstream tasks, while accounting for them improves it.
arXiv Detail & Related papers (2024-01-23T09:43:30Z) - A Characterization Theorem for Equivariant Networks with Point-wise
Activations [13.00676132572457]
We prove that rotation-equivariant networks can only be invariant, as is the case for any network that is equivariant with respect to a connected compact group.
We show that feature spaces of disentangled steerable convolutional neural networks are trivial representations.
arXiv Detail & Related papers (2024-01-17T14:30:46Z) - Geometry of Linear Neural Networks: Equivariance and Invariance under
Permutation Groups [0.0]
We investigate the subvariety of functions that are equivariant or invariant under the action of a permutation group.
We draw conclusions for the parameterization and the design of equivariant and invariant linear networks.
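For the special case of the full symmetric group S_d, the equivariant linear maps admit the classical two-parameter form a*I + b*(all-ones), a far smaller parameterization than a generic d x d matrix. The hedged NumPy sketch below covers only this simplest instance (not the paper's general treatment of arbitrary permutation groups) and checks equivariance numerically.
```python
import numpy as np

rng = np.random.default_rng(1)
d = 5
a, b = 1.7, -0.3                           # the two free parameters
W = a * np.eye(d) + b * np.ones((d, d))    # S_d-equivariant linear map: a*I + b*(all-ones)

x = rng.normal(size=d)
perm = rng.permutation(d)
Pmat = np.eye(d)[perm]                     # permutation matrix acting on coordinates

# equivariance check: applying W then permuting equals permuting then applying W
print(np.allclose(W @ (Pmat @ x), Pmat @ (W @ x)))   # True
```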
arXiv Detail & Related papers (2023-09-24T19:40:15Z) - Optimization Dynamics of Equivariant and Augmented Neural Networks [2.7918308693131135]
We investigate the optimization of neural networks on symmetric data.
We compare the strategy of constraining the architecture to be equivariant to that of using data augmentation.
Our analysis reveals that even in the latter situation, stationary points may be unstable for augmented training although they are stable for the manifestly equivariant models.
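One standard way to obtain the "manifestly equivariant" model in such a comparison is to project an arbitrary linear layer onto the space of maps that commute with a finite group action by averaging over the group. The sketch below is a generic symmetrization construction (not the paper's analysis), shown for cyclic shifts on R^4.
```python
import numpy as np

rng = np.random.default_rng(2)
d = 4
shift = np.roll(np.eye(d), 1, axis=0)                           # generator: cyclic shift matrix
group = [np.linalg.matrix_power(shift, k) for k in range(d)]    # the cyclic group C_4 acting on R^4

W = rng.normal(size=(d, d))                                     # an arbitrary linear layer
W_eq = sum(g.T @ W @ g for g in group) / len(group)             # group-averaged (symmetrized) layer

g = group[1]
print(np.allclose(W_eq @ g, g @ W_eq))                          # True: W_eq commutes with the action
```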
arXiv Detail & Related papers (2023-03-23T17:26:12Z) - Self-Supervised Learning for Group Equivariant Neural Networks [75.62232699377877]
Group equivariant neural networks are models whose structure is constrained to commute with transformations of the input.
We propose two concepts for self-supervised tasks: equivariant pretext labels and invariant contrastive loss.
Experiments on standard image recognition benchmarks demonstrate that the equivariant neural networks exploit the proposed self-supervised tasks.
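As a rough illustration of an invariant contrastive objective, the sketch below uses a simplified InfoNCE-style loss (written for this summary, not the paper's exact formulation): embeddings of two group-transformed views of the same batch are compared, and matching indices are treated as positives.
```python
import numpy as np

def invariant_contrastive_loss(z1, z2, tau=0.5):
    """Simplified InfoNCE on two group-transformed views of the same batch.

    z1[i] and z2[i] are embeddings of the same input under two random group
    actions; the loss pulls matching pairs together and pushes apart the rest,
    encouraging group-invariant features.
    """
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / tau                                   # cosine similarities between views
    logits = logits - logits.max(axis=1, keepdims=True)        # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))                        # positives sit on the diagonal

rng = np.random.default_rng(3)
z = rng.normal(size=(8, 16))
print(invariant_contrastive_loss(z, z + 0.05 * rng.normal(size=(8, 16))))  # lower when views agree
```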
arXiv Detail & Related papers (2023-03-08T08:11:26Z) - On the Effective Number of Linear Regions in Shallow Univariate ReLU
Networks: Convergence Guarantees and Implicit Bias [50.84569563188485]
We show that gradient flow converges in direction when labels are determined by the sign of a target network with $r$ neurons.
Our result may already hold for mild over-parameterization, where the width is $\tilde{\mathcal{O}}(r)$ and independent of the sample size.
arXiv Detail & Related papers (2022-05-18T16:57:10Z) - Topographic VAEs learn Equivariant Capsules [84.33745072274942]
We introduce the Topographic VAE: a novel method for efficiently training deep generative models with topographically organized latent variables.
We show that such a model indeed learns to organize its activations according to salient characteristics such as digit class, width, and style on MNIST.
We demonstrate approximate equivariance to complex transformations, expanding upon the capabilities of existing group equivariant neural networks.
arXiv Detail & Related papers (2021-09-03T09:25:57Z) - Directional Convergence Analysis under Spherically Symmetric
Distribution [21.145823611499104]
We consider the fundamental problem of learning linear predictors (i.e., separable datasets with zero margin) using neural networks with gradient flow or gradient descent.
We show directional convergence guarantees with exact convergence rate for two-layer non-linear networks with only two hidden nodes, and (deep) linear networks.
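Directional convergence is easy to see numerically. The NumPy sketch below is a generic logistic-regression example (not the paper's two-layer or deep-linear setting): gradient descent on separable data lets the parameter norm grow without bound while the normalized direction stabilizes.
```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.normal(size=(30, 2))
y = np.sign(X @ np.array([1.0, 2.0]))      # labels separable by construction

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

w = np.zeros(2)
for t in range(1, 200_001):
    grad = -(y * sigmoid(-y * (X @ w))) @ X / len(y)
    w -= 2.0 * grad
    if t in (1_000, 10_000, 200_000):
        print(t, round(np.linalg.norm(w), 2), w / np.linalg.norm(w))
# the norm keeps growing while the printed unit vector stabilizes: the predictor
# converges "in direction" (toward the max-margin separator) rather than to a
# finite parameter vector
```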
arXiv Detail & Related papers (2021-05-09T08:59:58Z) - GroupifyVAE: from Group-based Definition to VAE-based Unsupervised
Representation Disentanglement [91.9003001845855]
VAE-based unsupervised disentanglement cannot be achieved without introducing additional inductive biases.
We address VAE-based unsupervised disentanglement by leveraging constraints derived from the group-theory-based definition of disentanglement as a non-probabilistic inductive bias.
We train 1800 models covering the most prominent VAE-based models on five datasets to verify the effectiveness of our method.
arXiv Detail & Related papers (2021-02-20T09:49:51Z) - LieTransformer: Equivariant self-attention for Lie Groups [49.9625160479096]
Group equivariant neural networks are used as building blocks of group invariant neural networks.
We extend the scope of the literature to self-attention, which is emerging as a prominent building block of deep learning models.
We propose the LieTransformer, an architecture composed of LieSelfAttention layers that are equivariant to arbitrary Lie groups and their discrete subgroups.
arXiv Detail & Related papers (2020-12-20T11:02:49Z) - Learning Invariances in Neural Networks [51.20867785006147]
We show how to parameterize a distribution over augmentations and optimize the training loss simultaneously with respect to the network parameters and augmentation parameters.
We can recover the correct set and extent of invariances on image classification, regression, segmentation, and molecular property prediction from a large space of augmentations.
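A rough PyTorch sketch of this idea (all names and the toy task are invented for illustration; this is not the paper's implementation): a single learnable parameter controls the width of a random-rotation augmentation, predictions are averaged over sampled rotations, and the width is trained jointly with the network, with a small bonus for wider ranges so that invariance is learned only when it does not hurt the fit.
```python
import math
import torch

torch.manual_seed(0)

net = torch.nn.Sequential(torch.nn.Linear(2, 16), torch.nn.ReLU(), torch.nn.Linear(16, 1))
raw_width = torch.tensor(-2.0, requires_grad=True)   # rotation range = pi * sigmoid(raw_width)

def rotation(a):
    c, s = torch.cos(a), torch.sin(a)
    return torch.stack([torch.stack([c, -s]), torch.stack([s, c])])

# toy task whose labels depend only on the radius, so rotation invariance is the "right" answer
x = torch.randn(256, 2)
y = (x.norm(dim=1) > 1.0).float() * 2 - 1

opt = torch.optim.Adam(list(net.parameters()) + [raw_width], lr=0.02)
for _ in range(500):
    width = math.pi * torch.sigmoid(raw_width)
    angles = (torch.rand(8) * 2 - 1) * width                   # reparameterized sampling
    preds = torch.stack([net(x @ rotation(a).T) for a in angles]).mean(0).squeeze(-1)
    data_loss = torch.nn.functional.softplus(-y * preds).mean()
    loss = data_loss - 0.05 * raw_width                        # small bonus for wider ranges
    opt.zero_grad()
    loss.backward()
    opt.step()

print(float(math.pi * torch.sigmoid(raw_width)))   # typically grows toward pi: rotation
                                                   # invariance is discovered, not imposed
```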
arXiv Detail & Related papers (2020-10-22T17:18:48Z)