Enhancing Contrastive Learning Inspired by the Philosophy of "The Blind Men and the Elephant"
- URL: http://arxiv.org/abs/2412.16522v1
- Date: Sat, 21 Dec 2024 07:50:59 GMT
- Title: Enhancing Contrastive Learning Inspired by the Philosophy of "The Blind Men and the Elephant"
- Authors: Yudong Zhang, Ruobing Xie, Jiansheng Chen, Xingwu Sun, Zhanhui Kang, Yu Wang
- Abstract summary: We introduce JointCrop and JointBlur to generate challenging positive pairs in contrastive learning.
As a plug-and-play framework, JointCrop and JointBlur enhance the performance of SimCLR, BYOL, MoCo v1, MoCo v2, MoCo v3, SimSiam, and Dino baselines.
- Score: 29.19909246476688
- License:
- Abstract: Contrastive learning is a prevalent technique in self-supervised vision representation learning, typically generating positive pairs by applying two data augmentations to the same image. Designing effective data augmentation strategies is crucial for the success of contrastive learning. Inspired by the story of the blind men and the elephant, we introduce JointCrop and JointBlur. These methods generate more challenging positive pairs by leveraging the joint distribution of the two augmentation parameters, thereby enabling contrastive learning to acquire more effective feature representations. To the best of our knowledge, this is the first effort to explicitly incorporate the joint distribution of two data augmentation parameters into contrastive learning. As a plug-and-play framework without additional computational overhead, JointCrop and JointBlur enhance the performance of SimCLR, BYOL, MoCo v1, MoCo v2, MoCo v3, SimSiam, and Dino baselines with notable improvements.
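The core idea, drawing the two augmentation parameters jointly rather than independently, can be illustrated with a short sketch. The sampling rule and the `gap` parameter below are illustrative assumptions for intuition only, not the paper's actual formulation:

```python
import random

def sample_joint_crop_scales(s_min=0.2, s_max=1.0, gap=0.4):
    """Sketch of jointly sampling two crop-area scales.

    Standard contrastive pipelines draw the two views' scales
    independently; here the pair is drawn jointly so the scales differ
    by at least `gap`, yielding a more challenging positive pair.
    (`gap` is a hypothetical knob, not a parameter from the paper.)
    """
    s1 = random.uniform(s_min, s_max)
    # choose s2 from the part of the range at least `gap` away from s1
    low_ok = s1 - gap >= s_min
    high_ok = s1 + gap <= s_max
    if low_ok and (not high_ok or random.random() < 0.5):
        s2 = random.uniform(s_min, s1 - gap)
    elif high_ok:
        s2 = random.uniform(s1 + gap, s_max)
    else:
        s2 = random.uniform(s_min, s_max)  # gap infeasible; fall back
    return s1, s2
```

With the defaults above, any `s1` leaves room on at least one side, so the two sampled scales always differ by at least `gap`.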
Related papers
- Feature Augmentation for Self-supervised Contrastive Learning: A Closer Look [28.350278251132078]
We propose a unified framework for conducting data augmentation in the feature space, known as feature augmentation.
This strategy is domain-agnostic: it generates features similar to the original ones and thus improves data diversity.
arXiv Detail & Related papers (2024-10-16T09:25:11Z) - TwinCL: A Twin Graph Contrastive Learning Model for Collaborative Filtering [20.26347686022996]
We introduce a twin encoder in place of random augmentations, demonstrating the redundancy of traditional augmentation techniques.
Our proposed Twin Graph Contrastive Learning model -- TwinCL -- aligns positive pairs of user and item embeddings and the representations from the twin encoder.
Our theoretical analysis and experimental results show that the proposed model improves both recommendation accuracy and training efficiency.
arXiv Detail & Related papers (2024-09-27T22:31:08Z) - Dual-perspective Cross Contrastive Learning in Graph Transformers [33.18813968554711]
Graph contrastive learning (GCL) is a popular method for learning graph representations.
This paper proposes a framework termed dual-perspective cross graph contrastive learning (DC-GCL).
DC-GCL incorporates three modifications designed to enhance positive sample diversity and reliability.
arXiv Detail & Related papers (2024-06-01T11:11:49Z) - Visual Commonsense based Heterogeneous Graph Contrastive Learning [79.22206720896664]
We propose a heterogeneous graph contrastive learning method to better accomplish the visual reasoning task.
Our method is designed in a plug-and-play manner, so that it can be quickly and easily combined with a wide range of representative methods.
arXiv Detail & Related papers (2023-11-11T12:01:18Z) - GraphLearner: Graph Node Clustering with Fully Learnable Augmentation [76.63963385662426]
Contrastive deep graph clustering (CDGC) leverages the power of contrastive learning to group nodes into different clusters.
We propose a Graph Node Clustering with Fully Learnable Augmentation, termed GraphLearner.
It introduces learnable augmentors to generate high-quality and task-specific augmented samples for CDGC.
arXiv Detail & Related papers (2022-12-07T10:19:39Z) - Hierarchical Consistent Contrastive Learning for Skeleton-Based Action Recognition with Growing Augmentations [33.68311764817763]
We propose a general hierarchical consistent contrastive learning framework (HiCLR) for skeleton-based action recognition.
Specifically, we first design a gradual growing augmentation policy to generate multiple ordered positive pairs.
Then, an asymmetric loss is proposed to enforce the hierarchical consistency via a directional clustering operation.
arXiv Detail & Related papers (2022-11-24T08:09:50Z) - Semantics-Depth-Symbiosis: Deeply Coupled Semi-Supervised Learning of Semantics and Depth [83.94528876742096]
We tackle the MTL problem of two dense tasks, i.e., semantic segmentation and depth estimation, and present a novel attention module called Cross-Channel Attention Module (CCAM).
In a true symbiotic spirit, we then formulate a novel data augmentation for the semantic segmentation task using predicted depth called AffineMix, and a simple depth augmentation using predicted semantics called ColorAug.
Finally, we validate the performance gain of the proposed method on the Cityscapes dataset, which helps us achieve state-of-the-art results for a semi-supervised joint model based on depth and semantics.
arXiv Detail & Related papers (2022-06-21T17:40:55Z) - Weak Augmentation Guided Relational Self-Supervised Learning [80.0680103295137]
We introduce a novel relational self-supervised learning (ReSSL) framework that learns representations by modeling the relationship between different instances.
Our proposed method employs a sharpened distribution of pairwise similarities among different instances as the relation metric.
Experimental results show that our proposed ReSSL substantially outperforms the state-of-the-art methods across different network architectures.
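The sharpened relation metric that ReSSL describes can be sketched as a temperature-scaled softmax over pairwise cosine similarities. This is a simplified, NumPy-only illustration; the function name, defaults, and use of a single embedding set are assumptions, not the authors' implementation:

```python
import numpy as np

def relation_distribution(z, temperature=0.1):
    """Sketch of a ReSSL-style relation distribution (simplified).

    Computes a softmax over pairwise cosine similarities between
    instance embeddings; a lower temperature "sharpens" the
    distribution, mimicking the weakly-augmented target branch.
    """
    z = z / np.linalg.norm(z, axis=-1, keepdims=True)  # L2-normalize rows
    sims = z @ z.T                                     # pairwise cosine similarities
    np.fill_diagonal(sims, -np.inf)                    # exclude self-similarity
    logits = sims / temperature
    # numerically stable softmax over each row
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    return exp / exp.sum(axis=1, keepdims=True)
```

Each row is a probability distribution over the other instances; matching these distributions across two views is the relational objective the summary refers to.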
arXiv Detail & Related papers (2022-03-16T16:14:19Z) - ParamCrop: Parametric Cubic Cropping for Video Contrastive Learning [35.577788907544964]
We present a parametric cubic cropping operation, ParamCrop, for video contrastive learning.
ParamCrop is trained simultaneously with the video backbone using an adversarial objective and learns an optimal cropping strategy from the data.
Visualizations show that the center distance and the IoU between the two augmented views are adaptively controlled by ParamCrop.
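The two quantities that the ParamCrop visualizations track, the center distance and the IoU between two crop boxes, are plain geometry. A generic helper (not the paper's code) might look like this:

```python
def crop_iou_and_center_distance(box_a, box_b):
    """Compute the IoU and center distance of two axis-aligned crop
    boxes given as (x1, y1, x2, y2). Generic geometry helper,
    illustrating the quantities ParamCrop is said to control."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # intersection rectangle (empty if the boxes are disjoint)
    ix1, iy1 = max(ax1, bx1), max(ay1, by1)
    ix2, iy2 = min(ax2, bx2), min(ay2, by2)
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    iou = inter / (area_a + area_b - inter)
    # Euclidean distance between box centers
    ca = ((ax1 + ax2) / 2.0, (ay1 + ay2) / 2.0)
    cb = ((bx1 + bx2) / 2.0, (by1 + by2) / 2.0)
    dist = ((ca[0] - cb[0]) ** 2 + (ca[1] - cb[1]) ** 2) ** 0.5
    return iou, dist
```

Lower IoU and larger center distance correspond to a harder positive pair, which is the axis along which ParamCrop adapts its cropping.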
arXiv Detail & Related papers (2021-08-24T03:18:12Z) - Heterogeneous Contrastive Learning: Encoding Spatial Information for Compact Visual Representations [183.03278932562438]
This paper presents an effective approach that adds spatial information to the encoding stage to alleviate the learning inconsistency between the contrastive objective and strong data augmentation operations.
We show that our approach achieves higher efficiency in visual representations and thus delivers a key message to inspire future research on self-supervised visual representation learning.
arXiv Detail & Related papers (2020-11-19T16:26:25Z) - CoDA: Contrast-enhanced and Diversity-promoting Data Augmentation for Natural Language Understanding [67.61357003974153]
We propose a novel data augmentation framework dubbed CoDA.
CoDA synthesizes diverse and informative augmented examples by integrating multiple transformations organically.
A contrastive regularization objective is introduced to capture the global relationship among all the data samples.
arXiv Detail & Related papers (2020-10-16T23:57:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.