Interventional Contrastive Learning with Meta Semantic Regularizer
- URL: http://arxiv.org/abs/2206.14702v1
- Date: Wed, 29 Jun 2022 15:02:38 GMT
- Title: Interventional Contrastive Learning with Meta Semantic Regularizer
- Authors: Wenwen Qiang, Jiangmeng Li, Changwen Zheng, Bing Su, Hui Xiong
- Abstract summary: Contrastive learning (CL)-based self-supervised learning models learn visual representations in a pairwise manner.
When the CL model is trained on full images, its performance tested on full images is better than on foreground areas.
When the CL model is trained on foreground areas, its performance tested on full images is worse than on foreground areas.
- Score: 28.708395209321846
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contrastive learning (CL)-based self-supervised learning models learn visual
representations in a pairwise manner. Although prevailing CL models have achieved
great progress, in this paper we uncover a long-overlooked phenomenon: when a CL
model is trained on full images, its performance tested on full images is better
than on foreground areas; when the model is trained on foreground areas, its
performance tested on full images is worse than on foreground areas. This
observation reveals that image backgrounds may interfere with the model's
learning of semantic information and that their influence has not been fully
eliminated. To tackle this issue, we build a Structural Causal Model (SCM) that
models the background as a confounder. We propose a backdoor adjustment-based
regularization method, namely Interventional Contrastive Learning with Meta
Semantic Regularizer (ICL-MSR), to perform causal intervention on the proposed
SCM. ICL-MSR can be incorporated into any existing CL method to alleviate
background distractions during representation learning. Theoretically, we prove
that ICL-MSR achieves a tighter error bound. Empirically, experiments on
multiple benchmark datasets demonstrate that ICL-MSR improves the performance of
different state-of-the-art CL methods.
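The paper's exact ICL-MSR formulation is not reproduced here, but the idea of pairing a standard contrastive objective with a background-aware regularization term can be sketched. The snippet below is a minimal, hypothetical illustration: info_nce is a standard InfoNCE loss, and background_adjustment_reg is a toy stand-in for a backdoor adjustment-style regularizer that evens out each representation's alignment over a set of assumed background prototypes. All names, the prototype construction, and the weighting lambda_reg are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch (assumptions, not the authors' code): a standard InfoNCE
# contrastive loss plus a toy backdoor-adjustment-style regularizer that
# penalizes uneven alignment with a set of assumed background prototypes.
import torch
import torch.nn.functional as F


def info_nce(z1, z2, temperature=0.5):
    """Standard pairwise contrastive (InfoNCE) loss between two views."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature              # (N, N) similarity matrix
    labels = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, labels)


def background_adjustment_reg(z, background_prototypes):
    """Toy regularizer: discourage representations from latching onto any
    single background stratum by penalizing the variance of their
    similarities across the assumed background prototypes."""
    z = F.normalize(z, dim=1)
    protos = F.normalize(background_prototypes, dim=1)
    sims = z @ protos.t()                           # (N, K) similarities
    return sims.var(dim=1).mean()


def regularized_contrastive_loss(z1, z2, background_prototypes, lambda_reg=0.1):
    """Contrastive loss augmented with the toy background regularizer."""
    reg = background_adjustment_reg(z1, background_prototypes) + \
          background_adjustment_reg(z2, background_prototypes)
    return info_nce(z1, z2) + lambda_reg * reg


# Usage: z1, z2 are encoder outputs for two augmented views of a batch.
z1 = torch.randn(256, 128, requires_grad=True)
z2 = torch.randn(256, 128, requires_grad=True)
background_prototypes = torch.randn(16, 128)        # K assumed background strata
regularized_contrastive_loss(z1, z2, background_prototypes).backward()
```

Because such a regularizer only adds a term to the training objective, a plug-in design of this kind is consistent with the abstract's claim that ICL-MSR can be incorporated into any existing CL method.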
Related papers
- Theoretical Insights into Overparameterized Models in Multi-Task and Replay-Based Continual Learning [37.745896674964186]
Multi-task learning (MTL) aims to improve the generalization performance of a model on multiple related tasks by training it simultaneously on those tasks.
Continual learning (CL) involves adapting to new sequentially arriving tasks over time without forgetting the previously acquired knowledge.
We develop theoretical results describing the effect of various system parameters on the model's performance in an MTL setup.
Our results reveal the impact of buffer size and model capacity on the forgetting rate in a CL setup and help shed light on some of the state-of-the-art CL methods.
arXiv Detail & Related papers (2024-08-29T23:22:40Z)
- Continual Panoptic Perception: Towards Multi-modal Incremental Interpretation of Remote Sensing Images [16.0258685984844]
Continual learning (CL) breaks away from the one-off training paradigm and enables a model to adapt continuously to new data, semantics, and tasks.
We propose a unified continual learning model that leverages multi-task joint learning covering pixel-level classification, instance-level segmentation and image-level perception.
arXiv Detail & Related papers (2024-07-19T12:22:32Z)
- Data Poisoning for In-context Learning [49.77204165250528]
In-context learning (ICL) has been recognized for its innovative ability to adapt to new tasks.
This paper delves into the critical issue of ICL's susceptibility to data poisoning attacks.
We introduce ICLPoison, a specialized attacking framework conceived to exploit the learning mechanisms of ICL.
arXiv Detail & Related papers (2024-02-03T14:20:20Z)
- In-context Learning and Gradient Descent Revisited [3.085927389171139]
We show that even untrained models achieve comparable ICL-GD similarity scores despite not exhibiting ICL.
Next, we explore a major discrepancy in the flow of information throughout the model between ICL and GD, which we term Layer Causality.
We propose a simple GD-based optimization procedure that respects layer causality, and show it improves similarity scores significantly.
arXiv Detail & Related papers (2023-11-13T21:42:38Z)
- Understanding Transferable Representation Learning and Zero-shot Transfer in CLIP [84.90129481336659]
We study transferable representation learning underlying CLIP and demonstrate how features from different modalities get aligned.
Inspired by our analysis, we propose a new CLIP-type approach, which achieves better performance than CLIP and other state-of-the-art methods on benchmark datasets.
arXiv Detail & Related papers (2023-10-02T06:41:30Z)
- Continual Vision-Language Representation Learning with Off-Diagonal Information [112.39419069447902]
Multi-modal contrastive learning frameworks like CLIP typically require a large amount of image-text samples for training.
This paper discusses the feasibility of continual CLIP training using streaming data.
arXiv Detail & Related papers (2023-05-11T08:04:46Z)
- Non-Contrastive Learning Meets Language-Image Pre-Training [145.6671909437841]
We study the validity of non-contrastive language-image pre-training (nCLIP).
We introduce xCLIP, a multi-tasking framework combining CLIP and nCLIP, and show that nCLIP aids CLIP in enhancing feature semantics.
arXiv Detail & Related papers (2022-10-17T17:57:46Z)
- Learning Deep Representations via Contrastive Learning for Instance Retrieval [11.736450745549792]
This paper makes the first attempt to tackle the problem using instance-discrimination-based contrastive learning (CL).
In this work, we approach this problem by exploring the capability of deriving discriminative representations from pre-trained and fine-tuned CL models.
arXiv Detail & Related papers (2022-09-28T04:36:34Z)
- Using Representation Expressiveness and Learnability to Evaluate Self-Supervised Learning Methods [61.49061000562676]
We introduce Cluster Learnability (CL) to assess learnability.
CL is measured in terms of the performance of a KNN trained to predict labels obtained by clustering the representations with K-means.
We find that CL better correlates with in-distribution model performance than other competing recent evaluation schemes (a minimal sketch of this evaluation procedure appears after the list).
arXiv Detail & Related papers (2022-06-02T19:05:13Z)
- ReCLIP: A Strong Zero-Shot Baseline for Referring Expression Comprehension [114.85628613911713]
Large-scale pre-trained models are useful for image classification across domains.
We present ReCLIP, a simple but strong zero-shot baseline that repurposes CLIP, a state-of-the-art large-scale model, for ReC.
arXiv Detail & Related papers (2022-04-12T17:55:38Z)
- A Comprehensive Empirical Study of Vision-Language Pre-trained Model for Supervised Cross-Modal Retrieval [19.2650103482509]
Cross-Modal Retrieval (CMR) is an important research topic across multimodal computing and information retrieval.
We take CLIP as the current representative vision-language pre-trained model to conduct a comprehensive empirical study.
We propose a novel model, CLIP4CMR, that employs pre-trained CLIP as the backbone network to perform supervised CMR.
arXiv Detail & Related papers (2022-01-08T06:00:22Z)
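The Cluster Learnability measure summarized above is concrete enough to sketch. The following minimal example (illustrative function names and hyperparameters such as n_clusters and n_neighbors are assumptions, not the paper's settings) clusters a set of representations with K-means and reports how well a KNN classifier predicts those cluster labels on held-out points.

```python
# Hypothetical sketch of a Cluster Learnability-style evaluation: cluster the
# representations with K-means, then score a KNN trained to predict the
# resulting cluster labels on a held-out split. Names and hyperparameters are
# illustrative, not the paper's code.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def cluster_learnability(representations, n_clusters=10, n_neighbors=5, seed=0):
    """Held-out accuracy of a KNN trained to predict K-means cluster labels."""
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(representations)
    X_train, X_test, y_train, y_test = train_test_split(
        representations, labels, test_size=0.5, random_state=seed
    )
    knn = KNeighborsClassifier(n_neighbors=n_neighbors).fit(X_train, y_train)
    return knn.score(X_test, y_test)


# Usage with random features standing in for learned representations.
features = np.random.default_rng(0).normal(size=(1000, 64))
print(cluster_learnability(features))
```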