Structuring Representation Geometry with Rotationally Equivariant
Contrastive Learning
- URL: http://arxiv.org/abs/2306.13924v1
- Date: Sat, 24 Jun 2023 10:07:52 GMT
- Title: Structuring Representation Geometry with Rotationally Equivariant
Contrastive Learning
- Authors: Sharut Gupta, Joshua Robinson, Derek Lim, Soledad Villar, Stefanie
Jegelka
- Abstract summary: Self-supervised learning converts raw perceptual data such as images to a compact space where simple Euclidean distances measure meaningful variations in data.
We extend this formulation with additional geometric structure on the embedding space, enforcing that transformations of input space correspond to simple transformations of embedding space.
We show that merely combining our equivariant loss with a non-collapse term results in non-trivial representations.
- Score: 42.20218717636608
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised learning converts raw perceptual data such as images to a
compact space where simple Euclidean distances measure meaningful variations in
data. In this paper, we extend this formulation by adding additional geometric
structure to the embedding space by enforcing transformations of input space to
correspond to simple (i.e., linear) transformations of embedding space.
Specifically, in the contrastive learning setting, we introduce an equivariance
objective and theoretically prove that its minima force augmentations on input
space to correspond to rotations on the spherical embedding space. We show that
merely combining our equivariant loss with a non-collapse term results in
non-trivial representations, without requiring invariance to data
augmentations. Optimal performance is achieved by also encouraging approximate
invariance, where input augmentations correspond to small rotations. Our
method, CARE: Contrastive Augmentation-induced Rotational Equivariance, leads
to improved performance on downstream tasks, and ensures sensitivity in
embedding space to important variations in data (e.g., color) that standard
contrastive methods do not achieve. Code is available at
https://github.com/Sharut/CARE.
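The abstract names two ingredients: an equivariance objective whose minima make input augmentations act as rotations of the spherical embedding space, and a non-collapse term. The sketch below is a hedged reconstruction of that recipe, not the authors' implementation (see the linked repository for that): it fits the best orthogonal map between two batches of same-augmentation embeddings via orthogonal Procrustes, penalizes the residual, and uses the uniformity term of Wang and Isola (2020) as the non-collapse component. The function names and the Procrustes alignment are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def best_orthogonal_map(z1, z2):
    """Orthogonal Procrustes: the R minimizing ||z1 @ R - z2||_F.

    Assumes every pair in the batch shares the SAME augmentation, so a
    single rotation should approximately explain the change of view.
    """
    m = z1.T @ z2                    # (dim, dim) cross-correlation
    u, _, vh = torch.linalg.svd(m)
    return (u @ vh).detach()         # detach: align only, no grad through SVD

def uniformity(z, t=2.0):
    """Non-collapse term (Wang and Isola, 2020): log mean Gaussian potential."""
    return torch.pdist(z, p=2).pow(2).mul(-t).exp().mean().log()

def care_style_loss(z1, z2, lam=1.0):
    """Hedged sketch of an 'augmentations act as rotations' objective.

    z1, z2: (batch, dim) embeddings of two augmented views of the same images.
    """
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)  # spherical embeddings
    R = best_orthogonal_map(z1, z2)
    equiv = (z1 @ R - z2).pow(2).sum(dim=1).mean()  # residual after best rotation
    return equiv + lam * 0.5 * (uniformity(z1) + uniformity(z2))
```

Under this sketch, exact invariance corresponds to R ≈ I; the abstract's optimal regime of approximate invariance instead lets input augmentations act as small rotations, which the equivariance term alone does not forbid.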
Related papers
- Thinner Latent Spaces: Detecting dimension and imposing invariance through autoencoder gradient constraints [9.380902608139902]
We show that orthogonality relations within the latent layer of the network can be leveraged to infer the intrinsic dimensionality of nonlinear manifold data sets.
We outline the relevant theory relying on differential geometry, and describe the corresponding gradient-descent optimization algorithm.
arXiv Detail & Related papers (2024-08-28T20:56:35Z)
- Equivariant Spatio-Temporal Self-Supervision for LiDAR Object Detection [37.142470149311904]
We propose a spatio-temporal equivariant learning framework that considers spatial and temporal augmentations jointly.
Our pre-training method for 3D object detection outperforms existing equivariant and invariant approaches in many settings.
arXiv Detail & Related papers (2024-04-17T20:41:49Z)
- FRED: Towards a Full Rotation-Equivariance in Aerial Image Object Detection [28.47314201641291]
We introduce a Fully Rotation-Equivariant Oriented Object Detector (FRED).
Our method delivers comparable performance on DOTA-v1.0 and outperforms prior work by 1.5 mAP on DOTA-v1.5, while reducing model parameters to 16% of the baseline.
arXiv Detail & Related papers (2023-12-22T09:31:43Z)
- Gradient-Based Feature Learning under Structured Data [57.76552698981579]
In the anisotropic setting, the commonly used spherical gradient dynamics may fail to recover the true direction.
We show that an appropriate weight normalization, reminiscent of batch normalization, can alleviate this issue (a minimal sketch of such a reparameterization follows this entry).
In particular, under the spiked model with a suitably large spike, the sample complexity of gradient-based training can be made independent of the information exponent.
arXiv Detail & Related papers (2023-09-07T16:55:50Z)
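The weight normalization referenced above is, in the classic formulation of Salimans and Kingma (2016), a reparameterization that decouples a weight's direction from its scale. The sketch below shows that standard construction, which may differ in detail from the variant the paper analyzes; the class name is an illustrative assumption.

```python
import torch
import torch.nn as nn

class WeightNormLinear(nn.Module):
    """w = g * v / ||v||: direction v and scale g are learned separately."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.v = nn.Parameter(torch.randn(out_features, in_features))
        self.g = nn.Parameter(torch.ones(out_features))

    def forward(self, x):
        # Each row of v is normalized, so its gradient lies tangent to the
        # sphere; the scale g absorbs the radial (norm) component.
        w = self.g.unsqueeze(1) * self.v / self.v.norm(dim=1, keepdim=True)
        return x @ w.T
```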
- EquiMod: An Equivariance Module to Improve Self-Supervised Learning [77.34726150561087]
Self-supervised visual representation methods are closing the gap with supervised learning performance.
These methods rely on maximizing the similarity between embeddings of related synthetic inputs created through data augmentations.
We introduce EquiMod, a generic equivariance module that structures the learned latent space (a hedged sketch of such a module follows this entry).
arXiv Detail & Related papers (2022-11-02T16:25:54Z)
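One common way to realize an equivariance module of this kind is a predictor that maps an embedding together with the augmentation parameters to the embedding of the augmented view. The sketch below follows that spirit; the class name, layer sizes, and loss are illustrative assumptions, not EquiMod's actual architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EquivariancePredictor(nn.Module):
    """Predicts z_aug = f(aug(x)) from z = f(x) and the augmentation parameters."""
    def __init__(self, dim, aug_dim, hidden=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + aug_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, z, aug_params):
        # aug_params encode, e.g., crop coordinates or color-jitter strengths.
        return self.net(torch.cat([z, aug_params], dim=1))

def equivariance_loss(predictor, z, aug_params, z_aug):
    # Pull the predicted embedding toward the true augmented-view embedding.
    return 1 - F.cosine_similarity(predictor(z, aug_params), z_aug).mean()
```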
- Equivariance Discovery by Learned Parameter-Sharing [153.41877129746223]
We study how to discover interpretable equivariances from data.
Specifically, we formulate this discovery process as an optimization problem over a model's parameter-sharing schemes (a sketch of one such relaxation follows this entry).
Also, we theoretically analyze the method for Gaussian data and provide a bound on the mean squared gap between the studied discovery scheme and the oracle scheme.
arXiv Detail & Related papers (2022-04-07T17:59:19Z)
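A simple continuous relaxation of optimizing over parameter-sharing schemes (an illustrative assumption, not necessarily the paper's exact parameterization) ties every weight to a small pool of shared parameters through a learned soft assignment; a hard, interpretable scheme can be read off after training by taking the argmax of the assignment logits.

```python
import torch
import torch.nn as nn

class SharedLinear(nn.Module):
    """Linear layer whose weights are a learned soft mixture of a shared pool."""
    def __init__(self, in_features, out_features, n_shared=8):
        super().__init__()
        self.theta = nn.Parameter(torch.randn(n_shared))  # shared parameter pool
        self.logits = nn.Parameter(torch.zeros(out_features, in_features, n_shared))

    def forward(self, x):
        assign = self.logits.softmax(dim=-1)   # soft sharing scheme per weight
        w = (assign * self.theta).sum(dim=-1)  # (out_features, in_features)
        return x @ w.T
```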
- Roto-Translation Equivariant Super-Resolution of Two-Dimensional Flows Using Convolutional Neural Networks [0.15229257192293202]
Convolutional neural networks (CNNs) often process vectors as directionless quantities, like the color channels of images.
This study investigates the effect of treating vectors as geometrical objects in terms of super-resolution of velocity on two-dimensional fluids.
arXiv Detail & Related papers (2022-02-22T07:07:07Z)
- Improving the Sample-Complexity of Deep Classification Networks with Invariant Integration [77.99182201815763]
Leveraging prior knowledge on intra-class variance due to transformations is a powerful method to improve the sample complexity of deep neural networks (a group-averaging sketch of invariant integration follows this entry).
We propose a novel monomial selection algorithm based on pruning methods to allow an application to more complex problems.
We demonstrate the improved sample complexity on the Rotated-MNIST, SVHN and CIFAR-10 datasets.
arXiv Detail & Related papers (2022-02-08T16:16:11Z)
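Invariant integration, which this entry and the 2020 entry below build on, constructs invariant features by averaging a nonlinear function (classically a monomial of feature values) over a transformation group. The sketch below averages over a discretized rotation group; the feature map and the rotation grid are illustrative choices.

```python
import torch
import torchvision.transforms.functional as TF

def rotation_invariant_feature(img, f, n_angles=8):
    """Group averaging: A[f](img) = mean over g of f(g . img).

    img: (C, H, W) tensor; f: any feature map, e.g. a monomial of pixel values.
    The result is approximately invariant to the sampled rotations.
    """
    feats = [f(TF.rotate(img, 360.0 * k / n_angles)) for k in range(n_angles)]
    return torch.stack(feats).mean(dim=0)

# Example monomial feature: second-order moment of the rotated image.
# feat = rotation_invariant_feature(img, lambda x: x.pow(2).mean())
```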
- Optimization for Oriented Object Detection via Representation Invariance Loss [2.501282372971187]
Mainstream rotation detectors use oriented bounding boxes (OBB) or quadrilateral bounding boxes (QBB) to represent rotated objects.
We propose a Representation Invariance Loss (RIL) to optimize bounding box regression for rotated objects.
Our method achieves consistent and substantial improvement in experiments on remote sensing datasets and scene text datasets.
arXiv Detail & Related papers (2021-03-22T07:55:33Z)
- Invariant Integration in Deep Convolutional Feature Space [77.99182201815763]
We show how to incorporate prior knowledge into a deep neural network architecture in a principled manner.
We report state-of-the-art performance on the Rotated-MNIST dataset.
arXiv Detail & Related papers (2020-04-20T09:45:43Z)