Distribution Estimation to Automate Transformation Policies for
Self-Supervision
- URL: http://arxiv.org/abs/2111.12265v1
- Date: Wed, 24 Nov 2021 04:40:00 GMT
- Title: Distribution Estimation to Automate Transformation Policies for
Self-Supervision
- Authors: Seunghan Yang, Debasmit Das, Simyung Chang, Sungrack Yun, Fatih
Porikli
- Abstract summary: In recent visual self-supervision works, an imitated classification objective, called a pretext task, is established by assigning labels to transformed or augmented input images.
It is observed that image transformations already present in the dataset might be less effective in learning such self-supervised representations.
We propose a framework based on a generative adversarial network to automatically find transformations that are not present in the input dataset.
- Score: 61.55875498848597
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent visual self-supervision works, an imitated classification
objective, called a pretext task, is established by assigning labels to
transformed or augmented input images. The goal of the pretext task can be to
predict which transformations were applied to the image. However, it is
observed that image transformations already present in the dataset might be
less effective in learning such self-supervised representations. Building on
this observation, we propose a framework based on a generative adversarial
network to automatically find the transformations which are not present in the
input dataset and are thus effective for self-supervised learning. This
automated policy allows us to estimate the transformation distribution of a
dataset and also to construct its complementary distribution, from which
training pairs are sampled for the pretext task. We evaluated our framework on
several visual recognition datasets to show the efficacy of our automated
transformation policy.
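The complementary-distribution idea from the abstract can be sketched in a few lines. The snippet below is a toy illustration under simplifying assumptions, not the paper's GAN-based estimator: it assumes an already-estimated probability `p_data[i]` that each candidate transformation appears in the dataset, and builds a normalized complementary distribution that up-weights absent transformations when sampling pretext-task training pairs (all names and values are hypothetical).

```python
import random

def complementary_distribution(p, eps=1e-8):
    """Given an estimated probability p[i] that candidate transformation i
    already appears in the dataset, build a normalized complementary
    distribution that up-weights transformations absent from the data."""
    q = [max(1.0 - pi, eps) for pi in p]
    total = sum(q)
    return [qi / total for qi in q]

# Hypothetical estimated distribution over four candidate transformations
# (e.g. rotation, color jitter, blur, shear); the values are illustrative.
p_data = [0.70, 0.20, 0.05, 0.05]
q = complementary_distribution(p_data)

# Sample transformation indices for pretext-task training pairs: the
# transformations rare in the data (indices 2 and 3) are drawn most often.
rng = random.Random(0)
samples = [rng.choices(range(len(q)), weights=q)[0] for _ in range(5)]
```

In the paper the distribution itself is estimated with a GAN rather than assumed; the sketch only shows why sampling from the complement concentrates training pairs on transformations the dataset lacks.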
Related papers
- Distill-SODA: Distilling Self-Supervised Vision Transformer for
Source-Free Open-Set Domain Adaptation in Computational Pathology [12.828728138651266]
Development of computational pathology models is essential for reducing manual tissue typing from whole slide images.
We propose a practical setting by addressing the above-mentioned challenges in one fell swoop, i.e., source-free open-set domain adaptation.
Our methodology focuses on adapting a pre-trained source model to an unlabeled target dataset.
arXiv Detail & Related papers (2023-07-10T14:36:51Z)
- Learning Explicit Object-Centric Representations with Vision Transformers [81.38804205212425]
We build on the self-supervision task of masked autoencoding and explore its effectiveness for learning object-centric representations with transformers.
We show that the model efficiently learns to decompose simple scenes as measured by segmentation metrics on several multi-object benchmarks.
arXiv Detail & Related papers (2022-10-25T16:39:49Z)
- Adapting Self-Supervised Vision Transformers by Probing Attention-Conditioned Masking Consistency [7.940705941237998]
We propose PACMAC, a simple two-stage adaptation algorithm for self-supervised ViTs.
Our simple approach leads to consistent performance gains over competing methods.
arXiv Detail & Related papers (2022-06-16T14:46:10Z)
- Prefix Conditioning Unifies Language and Label Supervision [84.11127588805138]
We show that dataset biases negatively affect pre-training by reducing the generalizability of learned representations.
In experiments, we show that this simple technique improves zero-shot image recognition accuracy and robustness to image-level distribution shift.
arXiv Detail & Related papers (2022-06-02T16:12:26Z)
- Robust Training Using Natural Transformation [19.455666609149567]
We present NaTra, an adversarial training scheme to improve the robustness of image classification algorithms.
We target attributes of the input images that are independent of the class identification, and manipulate those attributes to mimic real-world natural transformations.
We demonstrate the efficacy of our scheme by utilizing the disentangled latent representations derived from well-trained GANs.
arXiv Detail & Related papers (2021-05-10T01:56:03Z)
- Self-supervised Augmentation Consistency for Adapting Semantic Segmentation [56.91850268635183]
We propose an approach to domain adaptation for semantic segmentation that is both practical and highly accurate.
We employ standard data augmentation techniques $-$ photometric noise, flipping and scaling $-$ and ensure consistency of the semantic predictions.
We achieve significant improvements over the state-of-the-art segmentation accuracy after adaptation, consistent across different choices of backbone architecture and adaptation scenarios.
arXiv Detail & Related papers (2021-04-30T21:32:40Z)
- Instance Localization for Self-supervised Detection Pretraining [68.24102560821623]
We propose a new self-supervised pretext task, called instance localization.
We show that integration of bounding boxes into pretraining promotes better task alignment and architecture alignment for transfer learning.
Experimental results demonstrate that our approach yields state-of-the-art transfer learning results for object detection.
arXiv Detail & Related papers (2021-02-16T17:58:57Z)
- Self-Supervised Human Activity Recognition by Augmenting Generative Adversarial Networks [0.0]
This article proposes a novel approach for augmenting a generative adversarial network (GAN) with a self-supervised task.
In the proposed method, input video frames are randomly transformed by different spatial transformations.
The discriminator is encouraged to predict the applied transformation through an auxiliary loss.
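The auxiliary transformation-prediction objective described in this entry can be sketched minimally. The snippet below is a schematic illustration under simplifying assumptions (discrete 90-degree rotations as the transformation family, random logits standing in for the discriminator's auxiliary head), not the paper's implementation:

```python
import math
import random

def rotate90(frame, k):
    """Rotate a 2-D frame (list of rows) by k * 90 degrees; the four
    rotations form the discrete family of spatial transformations."""
    for _ in range(k % 4):
        frame = [list(row) for row in zip(*frame[::-1])]
    return frame

def auxiliary_loss(logits, k):
    """Cross-entropy between the discriminator's transformation logits
    and the index k of the transformation actually applied."""
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_norm - logits[k]

rng = random.Random(0)
frame = [[rng.random() for _ in range(4)] for _ in range(4)]
k = rng.randrange(4)                     # randomly chosen transformation
transformed = rotate90(frame, k)

# Stand-in for the auxiliary head's output on the transformed frame;
# minimizing auxiliary_loss trains the head to recover k.
logits = [rng.gauss(0.0, 1.0) for _ in range(4)]
loss = auxiliary_loss(logits, k)
```

Adding this loss to the discriminator's usual real/fake objective is what makes the task self-supervised: the labels come from the transformation applied, not from annotation.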
arXiv Detail & Related papers (2020-08-26T18:28:17Z)
- Probabilistic Spatial Transformer Networks [0.6999740786886537]
We propose a probabilistic extension that estimates a stochastic transformation rather than a deterministic one.
We show that these two properties lead to improved classification performance, robustness and model calibration.
We further demonstrate that the approach generalizes to non-visual domains by improving model performance on time-series data.
arXiv Detail & Related papers (2020-04-07T18:22:02Z)
- Learning Representations by Predicting Bags of Visual Words [55.332200948110895]
Self-supervised representation learning aims to learn convnet-based image representations from unlabeled data.
Inspired by the success of NLP methods in this area, in this work we propose a self-supervised approach based on spatially dense image descriptions.
arXiv Detail & Related papers (2020-02-27T16:45:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.