Effect of Rotation Angle in Self-Supervised Pre-training is Dataset-Dependent
- URL: http://arxiv.org/abs/2407.05218v1
- Date: Fri, 21 Jun 2024 12:25:07 GMT
- Title: Effect of Rotation Angle in Self-Supervised Pre-training is Dataset-Dependent
- Authors: Amy Saranchuk, Michael Guerzhoy
- Abstract summary: Self-supervised learning for pre-training can help the network learn better low-level features.
In contrastive pre-training, the network is pre-trained to distinguish between different versions of the input.
We show that, when training using contrastive pre-training in this way, the angle $\theta$ and the dataset interact in interesting ways.
- Score: 3.434553688053531
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Self-supervised learning for pre-training (SSP) can help the network learn better low-level features, especially when the size of the training set is small. In contrastive pre-training, the network is pre-trained to distinguish between different versions of the input. For example, the network learns to distinguish pairs (original, rotated) of images, where the rotated image was rotated by angle $\theta$, from other pairs of images. In this work, we show that, when training with contrastive pre-training in this way, the angle $\theta$ and the dataset interact in interesting ways. We hypothesize that, for some datasets, the network can take "shortcuts" for particular rotation angles $\theta$ based on the distribution of the gradient directions in the input, possibly avoiding learning features other than edges; however, our experiments do not conclusively support that hypothesis. We demonstrate experiments on three radiology datasets. We compute the saliency map indicating which pixels were important in the SSP process, and compare the saliency map to the ground-truth foreground/background segmentation. Our visualizations indicate that the effects of rotation angles in SSP are dataset-dependent. We believe the distribution of gradient orientations may play a role in this, but our experiments so far are inconclusive.
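To make the pretext task concrete, here is a minimal sketch, not the authors' code, of contrastive pre-training with a fixed rotation angle $\theta$: a shared encoder scores whether the second image in a pair is the first image rotated by $\theta$. The class and function names are illustrative, and the negative-pairing scheme is one plausible reading of the abstract.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.transforms.functional as TF

class PairDiscriminator(nn.Module):
    """Siamese scorer: is x2 equal to x1 rotated by a fixed angle theta?"""
    def __init__(self, encoder: nn.Module, feat_dim: int):
        super().__init__()
        self.encoder = encoder                    # backbone being pre-trained
        self.head = nn.Linear(2 * feat_dim, 1)    # pair classifier

    def forward(self, x1, x2):
        z = torch.cat([self.encoder(x1), self.encoder(x2)], dim=1)
        return self.head(z).squeeze(1)            # one logit per pair

def pretext_step(model, images, theta_deg, opt):
    """Positives: (x_i, rotate(x_i, theta)). Negatives: x_i paired with the
    rotation of a *different* image from the same batch."""
    rotated = TF.rotate(images, angle=theta_deg)
    negatives = rotated.roll(shifts=1, dims=0)    # mismatched pairs
    logits = torch.cat([model(images, rotated), model(images, negatives)])
    labels = torch.cat([torch.ones(len(images)),
                        torch.zeros(len(images))]).to(logits.device)
    loss = F.binary_cross_entropy_with_logits(logits, labels)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```

Varying `theta_deg` per dataset is exactly the interaction the paper studies; the saliency analysis would then ask which input pixels drive these pair logits.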
Related papers
- On the Influence of Shape, Texture and Color for Learning Semantic Segmentation [5.172964916120902]
In recent years, a body of work has emerged studying the shape and texture biases of off-the-shelf pre-trained deep neural networks (DNNs) for image classification.
We study these questions on semantic segmentation which allows us to address our questions on pixel level.
Our study on three datasets reveals that neither texture nor shape clearly dominates the learning success; however, a combination of shape and color, but without texture, achieves surprisingly strong results.
arXiv Detail & Related papers (2024-10-18T21:52:02Z)
- CSP: Self-Supervised Contrastive Spatial Pre-Training for Geospatial-Visual Representations [90.50864830038202]
We present Contrastive Spatial Pre-Training (CSP), a self-supervised learning framework for geo-tagged images.
We use a dual-encoder to separately encode the images and their corresponding geo-locations, and use contrastive objectives to learn effective location representations from images.
CSP significantly boosts model performance, yielding a 10-34% relative improvement across various labeled training data sampling ratios.
arXiv Detail & Related papers (2023-05-01T23:11:18Z)
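As a rough illustration of the dual-encoder idea in the entry above, the sketch below pairs an image embedding with a location embedding under an InfoNCE-style objective; the MLP location encoder and the loss details are assumptions, not CSP's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualEncoder(nn.Module):
    def __init__(self, img_encoder: nn.Module, dim: int = 128):
        super().__init__()
        self.img_encoder = img_encoder            # any vision backbone -> (B, dim)
        self.loc_encoder = nn.Sequential(         # hypothetical MLP over (lon, lat)
            nn.Linear(2, 256), nn.ReLU(), nn.Linear(256, dim))

    def forward(self, images, lonlat):
        zi = F.normalize(self.img_encoder(images), dim=1)
        zl = F.normalize(self.loc_encoder(lonlat), dim=1)
        return zi, zl

def info_nce(zi, zl, tau: float = 0.07):
    """Image-to-location InfoNCE: matching (image, location) pairs attract;
    all other locations in the batch act as negatives."""
    logits = zi @ zl.t() / tau                    # (B, B) similarity matrix
    targets = torch.arange(len(zi), device=zi.device)
    return F.cross_entropy(logits, targets)
```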
- ALSO: Automotive Lidar Self-supervision by Occupancy estimation [70.70557577874155]
We propose a new self-supervised method for pre-training the backbone of deep perception models operating on point clouds.
The core idea is to train the model on a pretext task which is the reconstruction of the surface on which the 3D points are sampled.
The intuition is that if the network is able to reconstruct the scene surface, given only sparse input points, then it probably also captures some fragments of semantic information.
arXiv Detail & Related papers (2022-12-12T13:10:19Z)
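The surface-reconstruction pretext task described above can be sketched as an occupancy prediction problem; the scene-level feature code and the query labeling rule below are simplifying assumptions, and the real ALSO pipeline differs in details such as query sampling and decoder design.

```python
import torch
import torch.nn as nn

class OccupancyPretext(nn.Module):
    def __init__(self, backbone: nn.Module, feat_dim: int):
        super().__init__()
        self.backbone = backbone                  # point-cloud encoder being pre-trained
        self.decoder = nn.Sequential(
            nn.Linear(feat_dim + 3, 128), nn.ReLU(), nn.Linear(128, 1))

    def forward(self, points, queries):
        """points: (B, N, 3) lidar returns; queries: (B, Q, 3) 3D probes.
        Returns an occupancy logit for each query."""
        feats = self.backbone(points)             # (B, feat_dim) scene code (simplified)
        feats = feats.unsqueeze(1).expand(-1, queries.size(1), -1)
        return self.decoder(torch.cat([feats, queries], dim=-1)).squeeze(-1)

# Hedged training target: queries just in front of a lidar return along its
# ray are free space (label 0); queries at/behind the return are occupied (1).
```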
- Semantic-aware Dense Representation Learning for Remote Sensing Image Change Detection [20.761672725633936]
Training a deep learning-based change detection model depends heavily on labeled data.
A recent trend is to use remote sensing (RS) data to obtain in-domain representations via supervised or self-supervised learning (SSL).
We propose dense semantic-aware pre-training for RS image change detection (CD) via sampling multiple class-balanced points.
arXiv Detail & Related papers (2022-05-27T06:08:33Z)
- Data Augmentation for Object Detection via Differentiable Neural Rendering [71.00447761415388]
It is challenging to train a robust object detector when annotated data is scarce.
Existing approaches to this problem include semi-supervised learning, which interpolates labeled data from unlabeled data.
We introduce an offline data augmentation method for object detection, which semantically interpolates the training data with novel views.
arXiv Detail & Related papers (2021-03-04T06:31:06Z)
- Neural-Pull: Learning Signed Distance Functions from Point Clouds by Learning to Pull Space onto Surfaces [68.12457459590921]
Reconstructing continuous surfaces from 3D point clouds is a fundamental operation in 3D geometry processing.
We introduce Neural-Pull, a new approach that is simple and leads to high-quality SDFs.
arXiv Detail & Related papers (2020-11-26T23:18:10Z)
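The "pulling" operation named in the title above can be sketched as follows, assuming the commonly described formulation (check the paper for exact details): each query point is moved along the negative normalized SDF gradient by its predicted distance, and the moved point should land on its nearest surface point.

```python
import torch

def pull_loss(sdf_net, queries, nearest_surface_pts):
    """queries: (Q, 3) points near the cloud; nearest_surface_pts: (Q, 3)
    their nearest input points. sdf_net maps (Q, 3) -> (Q, 1) distances."""
    queries = queries.requires_grad_(True)
    d = sdf_net(queries)                           # predicted signed distance
    grad = torch.autograd.grad(d.sum(), queries, create_graph=True)[0]
    direction = grad / (grad.norm(dim=1, keepdim=True) + 1e-8)
    pulled = queries - d * direction               # pull each query onto the surface
    return ((pulled - nearest_surface_pts) ** 2).sum(dim=1).mean()
```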
- Deep Positional and Relational Feature Learning for Rotation-Invariant Point Cloud Analysis [107.9979381402172]
We propose a rotation-invariant deep network for point cloud analysis.
The network is hierarchical and relies on two modules: a positional feature embedding block and a relational feature embedding block.
Experiments show state-of-the-art classification and segmentation performance on benchmark datasets.
arXiv Detail & Related papers (2020-11-18T04:16:51Z)
- What Do Neural Networks Learn When Trained With Random Labels? [20.54410239839646]
We study deep neural networks (DNNs) trained on natural image data with entirely random labels.
We show analytically for convolutional and fully connected networks that an alignment between the principal components of network parameters and data takes place when training with random labels.
We show how this alignment produces a positive transfer: networks pre-trained with random labels train faster downstream compared to training from scratch.
arXiv Detail & Related papers (2020-06-18T12:07:22Z)
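One simple way to probe the parameter-data alignment reported above (our illustration; the paper's analysis is more general) is to measure how much of the data's top principal subspace lies within the span of the first-layer filters:

```python
import numpy as np

def subspace_alignment(first_layer_w: np.ndarray, patches: np.ndarray, k: int = 8):
    """first_layer_w: (num_filters, patch_dim) flattened conv filters.
    patches: (num_patches, patch_dim) flattened input patches.
    Returns the mean squared projection of the top-k data principal
    directions onto the filter span (1.0 = fully contained)."""
    patches = patches - patches.mean(axis=0)       # center the data
    _, _, vt = np.linalg.svd(patches, full_matrices=False)
    pcs = vt[:k]                                   # top-k principal directions
    q, _ = np.linalg.qr(first_layer_w.T)           # orthonormal basis of filter span
    proj = pcs @ q                                 # coordinates in the filter span
    return float((proj ** 2).sum(axis=1).mean())
```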
- How Can CNNs Use Image Position for Segmentation? [23.98839374194848]
A recent study shows that the zero-padding employed in convolutional layers of CNNs provides position information to the CNNs.
However, there is a technical issue with the study's experimental design, so the correctness of the claim has yet to be verified.
arXiv Detail & Related papers (2020-05-07T13:38:13Z)
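A quick, self-contained probe of the padding claim above (our construction, not the study's experiment): with identical weights applied to a constant input, zero padding typically produces border-dependent responses while circular padding does not, which is exactly the kind of boundary signal a CNN could use to infer position.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
conv_zero = nn.Conv2d(1, 1, kernel_size=3, padding=1, padding_mode="zeros")
conv_circ = nn.Conv2d(1, 1, kernel_size=3, padding=1, padding_mode="circular")
conv_circ.load_state_dict(conv_zero.state_dict())   # identical weights and bias

x = torch.ones(1, 1, 8, 8)                           # constant image: no content cues
print(conv_zero(x).std().item())   # > 0: border outputs differ -> position signal
print(conv_circ(x).std().item())   # ~ 0: circular padding removes the boundary cue
```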
- Single Image Depth Estimation Trained via Depth from Defocus Cues [105.67073923825842]
Estimating depth from a single RGB image is a fundamental task in computer vision.
In this work, we rely on depth-from-focus cues instead of different views.
We present results that are on par with supervised methods on KITTI and Make3D datasets and outperform unsupervised learning approaches.
arXiv Detail & Related papers (2020-01-14T20:22:54Z)