Contrastive Learning with Stronger Augmentations
- URL: http://arxiv.org/abs/2104.07713v1
- Date: Thu, 15 Apr 2021 18:40:04 GMT
- Title: Contrastive Learning with Stronger Augmentations
- Authors: Xiao Wang, Guo-Jun Qi
- Abstract summary: We propose a general framework called Contrastive Learning with Stronger Augmentations (CLSA) to complement current contrastive learning approaches.
Here, the distribution divergence between the weakly and strongly augmented images over the representation bank is adopted to supervise the retrieval of strongly augmented queries.
Experiments showed that the information from strongly augmented images can significantly boost performance.
- Score: 63.42057690741711
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Representation learning has advanced significantly with the development of
contrastive learning methods. Most of those methods benefit from various
data augmentations that are carefully designed to maintain image identity,
so that the images transformed from the same instance can still be retrieved.
However, those carefully designed transformations limit further exploration of
the novel patterns exposed by other transformations. Meanwhile, as found in our
experiments, strong augmentations distort the images' structures,
making retrieval difficult. Thus, we propose a general framework called
Contrastive Learning with Stronger Augmentations (CLSA) to complement current
contrastive learning approaches. Here, the distribution divergence between the
weakly and strongly augmented images over the representation bank is adopted to
supervise the retrieval of strongly augmented queries from a pool of instances.
Experiments on the ImageNet dataset and downstream datasets showed that the
information from strongly augmented images can significantly boost
performance. For example, CLSA achieves a top-1 accuracy of 76.2% on ImageNet
with a standard ResNet-50 architecture and a fine-tuned single-layer
classifier, almost matching the 76.5% of the supervised result. The
code and pre-trained models are available at
https://github.com/maple-research-lab/CLSA.
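The abstract describes the key mechanism only at a high level: the similarity distribution of a weakly augmented view over the representation bank serves as a soft target for the matching distribution of the strongly augmented view. Below is a minimal sketch of that idea; the function name, the temperature value, and the use of KL divergence as the divergence measure are assumptions rather than the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def distribution_divergence_loss(weak_q, strong_q, bank, tau=0.2):
    """Hypothetical sketch of CLSA-style distributional supervision.

    weak_q:   (B, D) L2-normalized embeddings of weakly augmented views
    strong_q: (B, D) L2-normalized embeddings of strongly augmented views
    bank:     (K, D) L2-normalized representation bank (e.g., a queue)
    """
    # Similarity distribution of each view over the representation bank.
    target = F.softmax(weak_q @ bank.t() / tau, dim=1)          # (B, K)
    log_pred = F.log_softmax(strong_q @ bank.t() / tau, dim=1)  # (B, K)

    # Use the weak-view distribution (detached) as a soft target for the
    # strong-view distribution, supervising retrieval of strong queries.
    return F.kl_div(log_pred, target.detach(), reduction="batchmean")
```

Per the abstract, such a term complements a standard contrastive loss on weakly augmented pairs rather than replacing it.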
Related papers
- Synergy and Diversity in CLIP: Enhancing Performance Through Adaptive Backbone Ensembling [58.50618448027103]
Contrastive Language-Image Pretraining (CLIP) stands out as a prominent method for image representation learning.
This paper explores the differences across various CLIP-trained vision backbones.
The method achieves a remarkable increase in accuracy of up to 39.1% over the best single backbone.
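The summary does not spell out how the adaptive ensembling works; one plausible reading is a learned weighting over features from several frozen CLIP backbones. The sketch below illustrates only that reading: the class name, the shared projection, and the input-independent weights are all assumptions.

```python
import torch
import torch.nn as nn

class AdaptiveBackboneEnsemble(nn.Module):
    """Hypothetical sketch: combine several CLIP vision backbones
    (assumed frozen by the caller) via learned softmax weights."""

    def __init__(self, backbones, feature_dims, embed_dim=512):
        super().__init__()
        self.backbones = nn.ModuleList(backbones)
        # Project each backbone's features into a shared space.
        self.projs = nn.ModuleList(
            nn.Linear(d, embed_dim) for d in feature_dims)
        self.weight_logits = nn.Parameter(torch.zeros(len(backbones)))

    def forward(self, images):
        w = self.weight_logits.softmax(dim=0)
        feats = [p(b(images)) for b, p in zip(self.backbones, self.projs)]
        # Weighted combination of the projected backbone features.
        return sum(wi * f for wi, f in zip(w, feats))  # (B, embed_dim)
```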
arXiv Detail & Related papers (2024-05-27T12:59:35Z)
- Transformer-based Clipped Contrastive Quantization Learning for Unsupervised Image Retrieval [15.982022297570108]
Unsupervised image retrieval aims to learn important visual characteristics without any given labels, so that similar images can be retrieved for a given query image.
In this paper, we propose a TransClippedCLR model that encodes the global context of an image using a Transformer, capturing local context through patch-based processing.
Results using the proposed clipped contrastive learning are greatly improved on all datasets compared to the same backbone network with vanilla contrastive learning.
arXiv Detail & Related papers (2024-01-27T09:39:11Z)
- Feature transforms for image data augmentation [74.12025519234153]
In image classification, many augmentation approaches utilize simple image manipulation algorithms.
In this work, we build ensembles on the data level by adding images generated by combining fourteen augmentation approaches.
Pretrained ResNet50 networks are finetuned on training sets that include images derived from each augmentation method.
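The summary suggests a straightforward recipe: fine-tune one network per augmentation method and fuse their predictions at test time. A minimal sketch under that reading follows; the averaging fusion rule is an assumption, as the paper may combine models differently.

```python
import torch

@torch.no_grad()
def ensemble_predict(models, images):
    """Hypothetical sketch of a data-level ensemble: each model in
    `models` was fine-tuned on a training set extended with one
    augmentation method; their softmax outputs are averaged."""
    probs = [m(images).softmax(dim=1) for m in models]   # list of (B, C)
    return torch.stack(probs).mean(dim=0).argmax(dim=1)  # (B,) class ids
```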
arXiv Detail & Related papers (2022-01-24T14:12:29Z)
- Weakly Supervised Contrastive Learning [68.47096022526927]
We introduce a weakly supervised contrastive learning framework (WCL) to tackle the class-collision issue in instance-wise contrastive learning, where semantically similar samples are treated as negatives.
WCL achieves 65% and 72% ImageNet Top-1 Accuracy using ResNet50, which is even higher than SimCLRv2 with ResNet101.
arXiv Detail & Related papers (2021-10-10T12:03:52Z)
- AugNet: End-to-End Unsupervised Visual Representation Learning with Image Augmentation [3.6790362352712873]
We propose AugNet, a new deep learning training paradigm to learn image features from a collection of unlabeled pictures.
Our experiments demonstrate that the method is able to represent images in a low-dimensional space.
Unlike many deep-learning-based image retrieval algorithms, our approach does not require access to external annotated datasets.
arXiv Detail & Related papers (2021-06-11T09:02:30Z)
- With a Little Help from My Friends: Nearest-Neighbor Contrastive Learning of Visual Representations [87.72779294717267]
Using the nearest neighbor as a positive in contrastive losses significantly improves performance on ImageNet classification.
We demonstrate empirically that our method is less reliant on complex data augmentations.
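The core change relative to standard instance discrimination is small: each view's positive is replaced with its nearest neighbor from a support set of past embeddings before the InfoNCE loss. A hedged sketch of that loss is below; support-set maintenance and loss symmetrization are omitted, and the temperature is an assumed value.

```python
import torch
import torch.nn.functional as F

def nn_contrastive_loss(z1, z2, support, tau=0.1):
    """Sketch of a nearest-neighbor contrastive loss in the spirit of
    this paper: the positive for each sample is the nearest neighbor
    of its first view in a support set of embeddings."""
    z1 = F.normalize(z1, dim=1)            # (B, D) embeddings, view 1
    z2 = F.normalize(z2, dim=1)            # (B, D) embeddings, view 2
    support = F.normalize(support, dim=1)  # (Q, D) support set

    # Nearest support-set neighbor of each z1 becomes the positive.
    nn_idx = (z1 @ support.t()).argmax(dim=1)
    positives = support[nn_idx]            # (B, D)

    # InfoNCE: match positives[i] with z2[i] against the other z2.
    logits = positives @ z2.t() / tau      # (B, B)
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)
```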
arXiv Detail & Related papers (2021-04-29T17:56:08Z)
- MetaAugment: Sample-Aware Data Augmentation Policy Learning [20.988767360529362]
We learn a sample-aware data augmentation policy efficiently by formulating it as a sample reweighting problem.
An augmentation policy network takes a transformation and the corresponding augmented image as inputs, and outputs a weight to adjust the augmented image loss computed by a task network.
During training, the task network minimizes the weighted losses of augmented training images, while the policy network minimizes the loss of the task network on a validation set via meta-learning.
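The task-network side of this bi-level scheme can be sketched directly from the description: the policy network scores each (transformation, augmented image) pair, and the scores reweight the per-sample losses. The policy network's own meta-update on the validation set is omitted, and all interfaces here are assumptions.

```python
import torch
import torch.nn.functional as F

def weighted_task_step(task_net, policy_net, images_aug, labels,
                       t_embed, optimizer):
    """Hypothetical sketch of the task-network update: per-sample
    losses are rescaled by weights from the policy network, which
    sees the augmented images and a transformation embedding."""
    logits = task_net(images_aug)
    per_sample = F.cross_entropy(logits, labels, reduction="none")  # (B,)
    weights = policy_net(images_aug, t_embed).squeeze(-1)           # (B,)

    # Weights are treated as fixed during this step; the policy net
    # is updated separately on validation loss via meta-learning.
    loss = (weights.detach() * per_sample).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```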
arXiv Detail & Related papers (2020-12-22T15:19:27Z)
- FeatMatch: Feature-Based Augmentation for Semi-Supervised Learning [64.32306537419498]
We propose a novel learned feature-based refinement and augmentation method that produces a varied set of complex transformations.
These transformations also use information from both within-class and across-class representations that we extract through clustering.
We demonstrate that our method is comparable to the current state of the art on smaller datasets while being able to scale up to larger datasets.
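The summary points to feature-space transformations built from clustered prototypes. The sketch below is only a loose illustration of that idea, interpolating a feature toward a prototype-weighted mixture; the paper's learned, attention-based refinement module is more involved.

```python
import torch
import torch.nn.functional as F

def feature_augment(feat, prototypes, alpha=0.1):
    """Loose sketch of feature-space augmentation with class
    prototypes obtained by clustering (illustrative only).

    feat:       (B, D) sample features
    prototypes: (K, D) cluster prototypes
    """
    # Soft assignment of each feature to the prototypes.
    attn = F.softmax(feat @ prototypes.t(), dim=-1)  # (B, K)
    mixed = attn @ prototypes                        # (B, D)
    # Move each feature slightly toward its prototype mixture.
    return (1 - alpha) * feat + alpha * mixed
```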
arXiv Detail & Related papers (2020-07-16T17:55:31Z)
- Learning Test-time Augmentation for Content-based Image Retrieval [42.188013259368766]
Off-the-shelf convolutional neural network features achieve outstanding results in many image retrieval tasks.
Existing image retrieval approaches require fine-tuning or modification of pre-trained networks to adapt to variations unique to the target data.
Our method enhances the invariance of off-the-shelf features by aggregating features extracted from images augmented at test-time, with augmentations guided by a policy learned through reinforcement learning.
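The aggregation step itself is simple and can be sketched as follows; here `augmentations` stands in for the transformations selected by the learned policy, and mean pooling of L2-normalized features is an assumed fusion rule.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def aggregate_tta_features(encoder, image, augmentations):
    """Sketch of test-time augmentation for retrieval: extract
    features from several augmented copies of the query image and
    average them into a single, more invariant descriptor."""
    feats = [F.normalize(encoder(aug(image)), dim=-1)
             for aug in augmentations]
    return F.normalize(torch.stack(feats).mean(dim=0), dim=-1)
```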
arXiv Detail & Related papers (2020-02-05T05:08:41Z)