Representation Learning via Invariant Causal Mechanisms
- URL: http://arxiv.org/abs/2010.07922v1
- Date: Thu, 15 Oct 2020 17:53:37 GMT
- Title: Representation Learning via Invariant Causal Mechanisms
- Authors: Jovana Mitrovic, Brian McWilliams, Jacob Walker, Lars Buesing, Charles Blundell
- Abstract summary: Self-supervised learning has emerged as a strategy to reduce the reliance on costly supervised signal by pretraining representations only using unlabeled data.
We show how data augmentations can be more effectively utilized through explicit invariance constraints on the proxy classifiers employed during pretraining.
We propose a novel self-supervised objective, Representation Learning via Invariant Causal Mechanisms (ReLIC), that enforces invariant prediction of proxy targets across augmentations.
- Score: 19.0976564154636
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Self-supervised learning has emerged as a strategy to reduce the reliance on
costly supervised signal by pretraining representations only using unlabeled
data. These methods combine heuristic proxy classification tasks with data
augmentations and have achieved significant success, but our theoretical
understanding of this success remains limited. In this paper we analyze
self-supervised representation learning using a causal framework. We show how
data augmentations can be more effectively utilized through explicit invariance
constraints on the proxy classifiers employed during pretraining. Based on
this, we propose a novel self-supervised objective, Representation Learning via
Invariant Causal Mechanisms (ReLIC), that enforces invariant prediction of
proxy targets across augmentations through an invariance regularizer which
yields improved generalization guarantees. Further, using causality we
generalize contrastive learning, a particular kind of self-supervised method,
and provide an alternative theoretical explanation for the success of these
methods. Empirically, ReLIC significantly outperforms competing methods in
terms of robustness and out-of-distribution generalization on ImageNet, while
also significantly outperforming these methods on Atari achieving above
human-level performance on $51$ out of $57$ games.
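The abstract describes the objective only in words: a contrastive proxy task plus a regularizer that keeps proxy-target predictions invariant across augmentations. The following is a minimal sketch of one way such a loss could look, not the authors' implementation. It assumes an InfoNCE-style contrastive term and a KL-divergence penalty between the prediction distributions obtained from two augmented views. The function names, `alpha`, and `temperature` are illustrative assumptions that do not appear in the abstract.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (zero vectors are left unchanged)."""
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def proxy_predictions(anchors, targets, temperature=0.1):
    """Row i is a distribution over which target matches anchor i."""
    rows = []
    for a in anchors:
        logits = [sum(x * y for x, y in zip(a, t)) / temperature for t in targets]
        rows.append(softmax(logits))
    return rows


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def relic_style_loss(view1, view2, alpha=1.0, temperature=0.1):
    """Contrastive term plus an (assumed) KL invariance penalty.

    view1[i] and view2[i] are embeddings of two augmentations of example i.
    Returns (total, contrastive, invariance).
    """
    z1 = [l2_normalize(v) for v in view1]
    z2 = [l2_normalize(v) for v in view2]
    p12 = proxy_predictions(z1, z2, temperature)  # view 1 as anchor
    p21 = proxy_predictions(z2, z1, temperature)  # view 2 as anchor
    n = len(z1)
    # Proxy task: each anchor should pick out its own augmented counterpart.
    contrastive = -sum(math.log(p12[i][i] + 1e-12) for i in range(n)) / n
    # Invariance: predictions should not depend on which augmentation is the anchor.
    invariance = sum(kl_divergence(p12[i], p21[i]) for i in range(n)) / n
    return contrastive + alpha * invariance, contrastive, invariance


if __name__ == "__main__":
    a = [[1.0, 0.0], [0.0, 1.0]]
    b = [[1.0, 0.5], [0.0, 1.0]]
    print(relic_style_loss(a, b, alpha=0.5))
```

In this sketch the penalty vanishes when both views produce the same prediction distributions, so `alpha` only affects training when the augmentations actually change the predictions.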
Related papers
- Enhancing Robustness of Vision-Language Models through Orthogonality Learning and Self-Regularization [77.62516752323207]
We introduce an orthogonal fine-tuning method for efficiently fine-tuning pretrained weights and enabling enhanced robustness and generalization.
A self-regularization strategy is further used to keep the zero-shot generalization of VLMs stable; the resulting method is dubbed OrthSR.
For the first time, we revisit CLIP and CoOp with our method to effectively improve the model in the few-shot image classification scenario.
arXiv Detail & Related papers (2024-07-11T10:35:53Z)
- Augmenting Unsupervised Reinforcement Learning with Self-Reference [63.68018737038331]
Humans possess the ability to draw on past experiences explicitly when learning new tasks.
We propose the Self-Reference (SR) approach, an add-on module explicitly designed to leverage historical information.
Our approach achieves state-of-the-art results in terms of Interquartile Mean (IQM) performance and Optimality Gap reduction on the Unsupervised Reinforcement Learning Benchmark.
arXiv Detail & Related papers (2023-11-16T09:07:34Z)
- Generalizable Information Theoretic Causal Representation [37.54158138447033]
We propose to learn causal representation from observational data by regularizing the learning procedure with mutual information measures according to our hypothetical causal graph.
The optimization involves a counterfactual loss, from which we derive a theoretical guarantee that causality-inspired learning has reduced sample complexity and better generalization ability.
arXiv Detail & Related papers (2022-02-17T00:38:35Z)
- Revisiting Consistency Regularization for Semi-Supervised Learning [80.28461584135967]
We propose an improved consistency regularization framework by a simple yet effective technique, FeatDistLoss.
Experimental results show that our model defines a new state of the art for various datasets and settings.
arXiv Detail & Related papers (2021-12-10T20:46:13Z)
- Contrastive Learning for Fair Representations [50.95604482330149]
Trained classification models can unintentionally lead to biased representations and predictions.
Existing debiasing methods for classification models, such as adversarial training, are often expensive to train and difficult to optimise.
We propose a method for mitigating bias by incorporating contrastive learning, in which instances sharing the same class label are encouraged to have similar representations.
arXiv Detail & Related papers (2021-09-22T10:47:51Z)
- A Simple but Tough-to-Beat Data Augmentation Approach for Natural Language Understanding and Generation [53.8171136907856]
We introduce a set of simple yet effective data augmentation strategies dubbed cutoff.
cutoff relies on sampling consistency and thus adds little computational overhead.
cutoff consistently outperforms adversarial training and achieves state-of-the-art results on the IWSLT2014 German-English dataset.
arXiv Detail & Related papers (2020-09-29T07:08:35Z)
- Query-Free Adversarial Transfer via Undertrained Surrogates [14.112444998191698]
We introduce a new method for improving the efficacy of adversarial attacks in a black-box setting by undertraining the surrogate model on which the attacks are generated.
We show that this method transfers well across architectures and outperforms state-of-the-art methods by a wide margin.
arXiv Detail & Related papers (2020-07-01T23:12:22Z)
- Learning Diverse Representations for Fast Adaptation to Distribution Shift [78.83747601814669]
We present a method for learning multiple models, incorporating an objective that pressures each to learn a distinct way to solve the task.
We demonstrate our framework's ability to facilitate rapid adaptation to distribution shift.
arXiv Detail & Related papers (2020-06-12T12:23:50Z)