Can contrastive learning avoid shortcut solutions?
- URL: http://arxiv.org/abs/2106.11230v1
- Date: Mon, 21 Jun 2021 16:22:43 GMT
- Title: Can contrastive learning avoid shortcut solutions?
- Authors: Joshua Robinson, Li Sun, Ke Yu, Kayhan Batmanghelich, Stefanie
Jegelka, Suvrit Sra
- Abstract summary: Implicit feature modification (IFM) is a method for altering positive and negative samples in order to guide contrastive models towards capturing a wider variety of predictive features.
IFM reduces feature suppression and, as a result, improves performance on vision and medical imaging tasks.
- Score: 88.249082564465
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The generalization of representations learned via contrastive learning
depends crucially on what features of the data are extracted. However, we
observe that the contrastive loss does not always sufficiently guide which
features are extracted, a behavior that can negatively impact the performance
on downstream tasks via "shortcuts", i.e., by inadvertently suppressing
important predictive features. We find that feature extraction is influenced by
the difficulty of the so-called instance discrimination task (i.e., the task of
discriminating pairs of similar points from pairs of dissimilar ones). Although
harder pairs improve the representation of some features, the improvement comes
at the cost of suppressing previously well represented features. In response,
we propose implicit feature modification (IFM), a method for altering positive
and negative samples in order to guide contrastive models towards capturing a
wider variety of predictive features. Empirically, we observe that IFM reduces
feature suppression, and as a result improves performance on vision and medical
imaging tasks. The code is available at: https://github.com/joshr17/IFM.
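The perturbation that IFM applies has a simple closed form for the InfoNCE loss: with l2-normalized embeddings, moving a positive away from the anchor and the negatives toward it (both along the anchor direction) amounts to shifting the corresponding similarity logits by a budget epsilon. Below is a minimal PyTorch sketch under that assumption; the function name and hyperparameter values are illustrative, not taken from the authors' repository.

```python
import torch
import torch.nn.functional as F

def ifm_info_nce(anchor, positive, negatives, epsilon=0.1, temperature=0.5):
    # anchor: (B, D), positive: (B, D), negatives: (B, K, D);
    # all embeddings assumed l2-normalized.
    pos_sim = (anchor * positive).sum(-1)                    # (B,)
    neg_sim = torch.einsum('bd,bkd->bk', anchor, negatives)  # (B, K)

    def info_nce(pos, neg):
        # Positive logit goes in column 0, so the target class is 0.
        logits = torch.cat([pos.unsqueeze(1), neg], dim=1) / temperature
        labels = torch.zeros(logits.size(0), dtype=torch.long, device=logits.device)
        return F.cross_entropy(logits, labels)

    clean = info_nce(pos_sim, neg_sim)
    # Harder instance discrimination: positives pulled away from the
    # anchor, negatives pushed toward it, each by epsilon.
    perturbed = info_nce(pos_sim - epsilon, neg_sim + epsilon)
    return 0.5 * (clean + perturbed)
```

Averaging the clean and perturbed losses keeps the original task while making shortcut features less discriminative, which is the intended pressure toward a wider set of predictive features.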
Related papers
- Spurious Feature Eraser: Stabilizing Test-Time Adaptation for Vision-Language Foundation Model [86.9619638550683]
Vision-language foundation models have exhibited remarkable success across a multitude of downstream tasks due to their scalability on extensive image-text paired data.
However, these models display significant limitations when applied to downstream tasks, such as fine-grained image classification, as a result of "decision shortcuts".
arXiv Detail & Related papers (2024-03-01T09:01:53Z) - Amortised Invariance Learning for Contrastive Self-Supervision [11.042648980854485]
We introduce the notion of amortised invariance learning for contrastive self-supervision.
We show that our amortised features provide a reliable way to learn diverse downstream tasks with different invariance requirements.
This provides an exciting perspective that opens up new horizons in the field of general purpose representation learning.
arXiv Detail & Related papers (2023-02-24T16:15:11Z) - CIPER: Combining Invariant and Equivariant Representations Using
Contrastive and Predictive Learning [6.117084972237769]
We introduce Contrastive Invariant and Predictive Equivariant Representation learning (CIPER)
CIPER comprises both invariant and equivariant learning objectives using one shared encoder and two different output heads on top of the encoder.
We evaluate our method on static image tasks and time-augmented image datasets.
arXiv Detail & Related papers (2023-02-05T07:50:46Z) - MetAug: Contrastive Learning via Meta Feature Augmentation [28.708395209321846]
We argue that contrastive learning heavily relies on informative features, or "hard" (positive or negative) features.
The key challenge in exploring such features is that the source multi-view data is generated by applying random data augmentations.
We propose to directly augment the features in latent space, thereby learning discriminative representations without a large amount of input data.
arXiv Detail & Related papers (2022-03-10T02:35:39Z) - Improving Transferability of Representations via Augmentation-Aware
Self-Supervision [117.15012005163322]
AugSelf is an auxiliary self-supervised loss that learns the difference of augmentation parameters between two randomly augmented samples.
Our intuition is that AugSelf encourages the model to preserve augmentation-aware information in the learned representations, which could benefit their transferability (a minimal sketch of such an auxiliary head appears after this list).
AugSelf can easily be incorporated into recent state-of-the-art representation learning methods with a negligible additional training cost.
arXiv Detail & Related papers (2021-11-18T10:43:50Z) - Improving Contrastive Learning by Visualizing Feature Transformation [37.548120912055595]
In this paper, we attempt to devise a feature-level data manipulation, differing from data augmentation, to enhance the generic contrastive self-supervised learning.
We first design a visualization scheme for the distribution of positive/negative scores (the positive/negative score measures the similarity of a positive/negative pair), which enables us to analyze, interpret, and understand the learning process.
Experiment results show that our proposed Feature Transformation can improve at least 6.0% accuracy on ImageNet-100 over MoCo baseline, and about 2.0% accuracy on ImageNet-1K over the MoCoV2 baseline.
arXiv Detail & Related papers (2021-08-06T07:26:08Z) - Investigating the Role of Negatives in Contrastive Representation
Learning [59.30700308648194]
Noise contrastive learning is a popular technique for unsupervised representation learning.
We focus on disambiguating the role of one of these parameters: the number of negative examples.
We find that the results broadly agree with our theory, while our vision experiments are murkier, with performance sometimes even insensitive to the number of negatives.
arXiv Detail & Related papers (2021-06-18T06:44:16Z) - What Should Not Be Contrastive in Contrastive Learning [110.14159883496859]
We introduce a contrastive learning framework which does not require prior knowledge of specific, task-dependent invariances.
Our model learns to capture varying and invariant factors for visual representations by constructing separate embedding spaces.
We use a multi-head network with a shared backbone which captures information across each augmentation and alone outperforms all baselines on downstream tasks.
arXiv Detail & Related papers (2020-08-13T03:02:32Z) - Disentanglement for Discriminative Visual Recognition [7.954325638519141]
This chapter systematically summarizes the detrimental factors as task-relevant/irrelevant semantic variations and unspecified latent variations.
Better facial expression recognition (FER) performance can be achieved by combining a deep metric loss and a softmax loss in a unified framework with two fully connected layer branches.
The framework achieves top performance on a series of tasks, including lighting-, makeup-, and disguise-tolerant face recognition and facial attribute recognition.
arXiv Detail & Related papers (2020-06-14T06:10:51Z) - Learning What Makes a Difference from Counterfactual Examples and
Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally-different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z)
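The AugSelf entry above describes a concrete mechanism: an auxiliary head regresses the difference of augmentation parameters between two views. Below is a minimal, hypothetical sketch of such a head; the class name, the feature and parameter dimensions, and the use of an MSE regression loss are illustrative assumptions, not details from the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AugSelfHead(nn.Module):
    # Predicts the difference of augmentation parameters (e.g., crop box
    # coordinates) from the backbone features of two augmented views.
    def __init__(self, feat_dim=512, aug_dim=4):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * feat_dim, feat_dim),
            nn.ReLU(inplace=True),
            nn.Linear(feat_dim, aug_dim),
        )

    def forward(self, h1, h2, aug1, aug2):
        # h1, h2: (B, feat_dim) features; aug1, aug2: (B, aug_dim) parameters.
        pred = self.mlp(torch.cat([h1, h2], dim=-1))
        return F.mse_loss(pred, aug1 - aug2)
```

In training, this auxiliary loss would be added to the main contrastive objective with a small weight, so the representation retains augmentation-aware information instead of discarding it as pure invariance.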