Batch Curation for Unsupervised Contrastive Representation Learning
- URL: http://arxiv.org/abs/2108.08643v1
- Date: Thu, 19 Aug 2021 12:14:50 GMT
- Title: Batch Curation for Unsupervised Contrastive Representation Learning
- Authors: Michael C. Welle, Petra Poklukar and Danica Kragic
- Abstract summary: We introduce a \textit{batch curation} scheme that selects batches during the training process that are more in line with the underlying contrastive objective.
We provide insights into what constitutes beneficial similar and dissimilar pairs, and validate \textit{batch curation} on CIFAR10.
- Score: 21.83249229426828
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The state-of-the-art unsupervised contrastive visual representation learning
methods that have emerged recently (SimCLR, MoCo, SwAV) all make use of data
augmentations in order to construct a pretext task of instance discrimination
consisting of similar and dissimilar pairs of images. Similar pairs are
constructed by randomly extracting patches from the same image and applying
several other transformations such as color jittering or blurring, while
transformed patches from different image instances in a given batch are
regarded as dissimilar pairs. We argue that this approach can result in similar
pairs that are \textit{semantically} dissimilar. In this work, we address this
problem by introducing a \textit{batch curation} scheme that selects batches
during the training process that are more in line with the underlying
contrastive objective. We provide insights into what constitutes beneficial
similar and dissimilar pairs as well as validate \textit{batch curation} on
CIFAR10 by integrating it in the SimCLR model.
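As a toy illustration of the idea above, batch curation could be realized by filtering out positive pairs whose two augmented views have drifted apart in embedding space. The abstract does not state the actual selection criterion, so the cosine-similarity threshold below is a hypothetical stand-in, not the paper's method:

```python
import numpy as np

def cosine_sim(a, b):
    """Row-wise cosine similarity between two batches of embeddings."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return np.sum(a * b, axis=1)

def curate_batch(view1, view2, threshold=0.2):
    """Keep only the pairs whose two augmented views are still similar.

    view1, view2: (N, D) embeddings of two augmentations of the same
    N images. `threshold` is a hypothetical hyperparameter.
    """
    keep = cosine_sim(view1, view2) >= threshold
    return view1[keep], view2[keep]
```

In a SimCLR-style loop, such a filter would run on the projection-head outputs before computing the contrastive loss, so that semantically mismatched "positive" pairs never contribute a misleading attraction term.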
Related papers
- Unsupervised Representation Learning by Balanced Self Attention Matching [2.3020018305241337]
We present a self-supervised method for embedding image features called BAM.
We obtain rich representations and avoid feature collapse by minimizing a loss that matches these distributions to their globally balanced and entropy regularized version.
We show competitive performance with leading methods on both semi-supervised and transfer-learning benchmarks.
arXiv Detail & Related papers (2024-08-04T12:52:44Z)
- Patch-Wise Self-Supervised Visual Representation Learning: A Fine-Grained Approach [4.9204263448542465]
This study introduces an innovative, fine-grained dimension by integrating patch-level discrimination into self-supervised visual representation learning.
We employ a distinctive photometric patch-level augmentation, where each patch is individually augmented, independent from other patches within the same view.
We present a simple yet effective patch-matching algorithm to find the corresponding patches across the augmented views.
arXiv Detail & Related papers (2023-10-28T09:35:30Z)
- Inter-Instance Similarity Modeling for Contrastive Learning [22.56316444504397]
We propose a novel image mix method, PatchMix, for contrastive learning in Vision Transformer (ViT)
Compared to the existing sample mix methods, our PatchMix can flexibly and efficiently mix more than two images.
Our proposed method significantly outperforms the previous state-of-the-art on both ImageNet-1K and CIFAR datasets.
arXiv Detail & Related papers (2023-06-21T13:03:47Z)
- Asymmetric Patch Sampling for Contrastive Learning [17.922853312470398]
Asymmetric appearance between the images of a positive pair effectively reduces the risk of representation degradation in contrastive learning.
We propose a novel asymmetric patch sampling strategy for contrastive learning, to boost the appearance asymmetry for better representations.
arXiv Detail & Related papers (2023-06-05T13:10:48Z)
- Improving Cross-Modal Retrieval with Set of Diverse Embeddings [19.365974066256026]
Cross-modal retrieval across image and text modalities is a challenging task due to its inherent ambiguity.
Set-based embedding has been studied as a solution to this problem.
We present a novel set-based embedding method, which is distinct from previous work in two aspects.
arXiv Detail & Related papers (2022-11-30T05:59:23Z)
- Attributable Visual Similarity Learning [90.69718495533144]
This paper proposes an attributable visual similarity learning (AVSL) framework for a more accurate and explainable similarity measure between images.
Motivated by the human semantic similarity cognition, we propose a generalized similarity learning paradigm to represent the similarity between two images with a graph.
Experiments on the CUB-200-2011, Cars196, and Stanford Online Products datasets demonstrate significant improvements over existing deep similarity learning methods.
arXiv Detail & Related papers (2022-03-28T17:35:31Z)
- Seed the Views: Hierarchical Semantic Alignment for Contrastive Representation Learning [116.91819311885166]
We propose a hierarchical semantic alignment strategy that expands the views generated from a single image to \textbf{cross-samples} and \textbf{multi-level} representations.
Our method, termed as CsMl, has the ability to integrate multi-level visual representations across samples in a robust way.
arXiv Detail & Related papers (2020-12-04T17:26:24Z)
- Support-set bottlenecks for video-text representation learning [131.4161071785107]
The dominant paradigm for learning video-text representations -- noise contrastive learning -- is too strict.
We propose a novel method that alleviates this by leveraging a generative model to naturally push these related samples together.
Our proposed method outperforms others by a large margin on MSR-VTT, VATEX and ActivityNet, and MSVD for video-to-text and text-to-video retrieval.
arXiv Detail & Related papers (2020-10-06T15:38:54Z)
- Contrastive Learning for Unpaired Image-to-Image Translation [64.47477071705866]
In image-to-image translation, each patch in the output should reflect the content of the corresponding patch in the input, independent of domain.
We propose a framework based on contrastive learning to maximize mutual information between the two.
We demonstrate that our framework enables one-sided translation in the unpaired image-to-image translation setting, while improving quality and reducing training time.
arXiv Detail & Related papers (2020-07-30T17:59:58Z)
- Unsupervised Landmark Learning from Unpaired Data [117.81440795184587]
Recent attempts for unsupervised landmark learning leverage synthesized image pairs that are similar in appearance but different in poses.
We propose a cross-image cycle consistency framework which applies the swapping-reconstruction strategy twice to obtain the final supervision.
Our proposed framework is shown to outperform strong baselines by a large margin.
arXiv Detail & Related papers (2020-06-29T13:57:20Z)
- Un-Mix: Rethinking Image Mixtures for Unsupervised Visual Representation Learning [108.999497144296]
Recently advanced unsupervised learning approaches use the siamese-like framework to compare two "views" from the same image for learning representations.
This work aims to involve the distance concept on label space in the unsupervised learning and let the model be aware of the soft degree of similarity between positive or negative pairs.
Despite its conceptual simplicity, we show empirically that with the solution -- Unsupervised image mixtures (Un-Mix), we can learn subtler, more robust and generalized representations from the transformed input and corresponding new label space.
arXiv Detail & Related papers (2020-03-11T17:59:04Z)
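The Un-Mix idea of giving the model a soft degree of similarity can be sketched as a mixup over the batch: each image is blended with a partner image, and the mixing coefficient softens the contrastive target between the resulting pairs. The reversed-batch pairing and Beta-distributed coefficient below are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np

def unmix_batch(images, alpha=1.0, rng=None):
    """Blend each image with its reversed-order partner in the batch.

    Returns the mixed batch and the mixing coefficient `lam`, which
    would also be used to soften the positive-pair similarity target
    (a pair mixed with lam close to 1 stays a strong positive).
    """
    rng = rng or np.random.default_rng(0)
    lam = rng.beta(alpha, alpha)  # mixing coefficient in (0, 1)
    mixed = lam * images + (1.0 - lam) * images[::-1]
    return mixed, lam
```

With `alpha=1.0` the coefficient is uniform on (0, 1); larger `alpha` concentrates it near 0.5, producing more aggressive mixtures and correspondingly softer labels.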
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences.