Tradeoffs Between Contrastive and Supervised Learning: An Empirical Study
- URL: http://arxiv.org/abs/2112.05340v1
- Date: Fri, 10 Dec 2021 05:19:32 GMT
- Title: Tradeoffs Between Contrastive and Supervised Learning: An Empirical Study
- Authors: Ananya Karthik, Mike Wu, Noah Goodman, Alex Tamkin
- Abstract summary: Contrastive learning has made considerable progress in computer vision, outperforming supervised pretraining on a range of downstream datasets.
We demonstrate two cases where it is not the better choice. First, under sufficiently small pretraining budgets, supervised pretraining on ImageNet consistently outperforms a comparable contrastive model on eight diverse image classification datasets.
Second, even with larger pretraining budgets we identify tasks where supervised learning prevails, perhaps because the object-centric bias of supervised pretraining makes the model more resilient to common corruptions and spurious foreground-background correlations.
- Score: 9.520526500374842
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contrastive learning has made considerable progress in computer vision,
outperforming supervised pretraining on a range of downstream datasets.
However, is contrastive learning the better choice in all situations? We
demonstrate two cases where it is not. First, under sufficiently small
pretraining budgets, supervised pretraining on ImageNet consistently
outperforms a comparable contrastive model on eight diverse image
classification datasets. This suggests that the common practice of comparing
pretraining approaches at hundreds or thousands of epochs may not produce
actionable insights for those with more limited compute budgets. Second, even
with larger pretraining budgets we identify tasks where supervised learning
prevails, perhaps because the object-centric bias of supervised pretraining
makes the model more resilient to common corruptions and spurious
foreground-background correlations. These results underscore the need to
characterize tradeoffs of different pretraining objectives across a wider range
of contexts and training regimes.
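The abstract contrasts two pretraining objectives without spelling them out on this page. As a rough illustration only, the sketch below pairs a SimCLR-style InfoNCE contrastive loss with a standard supervised cross-entropy loss; the specific formulation, temperature, and function names are assumptions for illustration and are not taken from the paper.

```python
# Illustrative numpy sketch of the two pretraining objectives being compared:
# a SimCLR-style InfoNCE (contrastive) loss vs. supervised cross-entropy.
# Embeddings and logits below are random stand-ins, not the paper's models.
import numpy as np

def info_nce_loss(z1, z2, temperature=0.1):
    """Contrastive loss over two augmented views z1, z2 of the same batch.

    z1, z2: (batch, dim) L2-normalized embeddings. Each row of z1 is pulled
    toward its counterpart in z2 and pushed away from all other embeddings.
    """
    batch = z1.shape[0]
    z = np.concatenate([z1, z2], axis=0)            # (2B, dim)
    sim = z @ z.T / temperature                     # pairwise similarities
    np.fill_diagonal(sim, -np.inf)                  # exclude self-similarity
    # positive index: view i in z1 pairs with i in z2, and vice versa
    pos = np.concatenate([np.arange(batch, 2 * batch), np.arange(batch)])
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * batch), pos].mean()

def cross_entropy_loss(logits, labels):
    """Supervised pretraining loss: softmax cross-entropy on class labels."""
    logits = logits - logits.max(axis=1, keepdims=True)   # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(labels)), labels].mean()

# Toy usage with random stand-in embeddings / logits.
rng = np.random.default_rng(0)
z1 = rng.normal(size=(8, 128)); z1 /= np.linalg.norm(z1, axis=1, keepdims=True)
z2 = rng.normal(size=(8, 128)); z2 /= np.linalg.norm(z2, axis=1, keepdims=True)
print("contrastive:", info_nce_loss(z1, z2))
print("supervised :", cross_entropy_loss(rng.normal(size=(8, 1000)), rng.integers(0, 1000, 8)))
```

The structural difference the sketch highlights is the supervision signal: the contrastive loss needs only two augmented views of each image, while the supervised loss needs class labels, which is the labeled-data dependence the paper's comparison hinges on.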
Related papers
- On the Trade-off of Intra-/Inter-class Diversity for Supervised Pre-training [72.8087629914444]
We study the impact of the trade-off between the intra-class diversity (the number of samples per class) and the inter-class diversity (the number of classes) of a supervised pre-training dataset.
With the size of the pre-training dataset fixed, the best downstream performance comes from a balance between intra- and inter-class diversity.
arXiv Detail & Related papers (2023-05-20T16:23:50Z)
- TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.
We propose a novel statistics-based approach, the Two-WIng NormliSation (TWINS) fine-tuning framework.
TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
arXiv Detail & Related papers (2023-03-20T14:12:55Z)
- An Empirical Investigation of the Role of Pre-training in Lifelong Learning [21.995593026269578]
We show that generic pre-training implicitly alleviates the effects of catastrophic forgetting when learning multiple tasks sequentially.
We study this phenomenon by analyzing the loss landscape, finding that pre-trained weights appear to ease forgetting by leading to wider minima.
arXiv Detail & Related papers (2021-12-16T19:00:55Z)
- Improved Fine-tuning by Leveraging Pre-training Data: Theory and Practice [52.11183787786718]
Fine-tuning a pre-trained model on the target data is widely used in many deep learning applications.
Recent studies have empirically shown that training from scratch can achieve final performance no worse than this pre-training strategy.
We propose a novel selection strategy to select a subset from pre-training data to help improve the generalization on the target task.
arXiv Detail & Related papers (2021-11-24T06:18:32Z)
- Rethinking supervised pre-training for better downstream transferring [46.09030708111374]
We propose a new supervised pre-training method based on Leave-One-Out K-Nearest-Neighbor, or LOOK.
It relieves overfitting to the upstream task by requiring only that each image share its class label with most of its k nearest neighbors.
We developed an efficient implementation of the proposed method that scales well to large datasets.
arXiv Detail & Related papers (2021-10-12T13:57:38Z)
- Contrastive Learning for Fair Representations [50.95604482330149]
Trained classification models can unintentionally lead to biased representations and predictions.
Existing debiasing methods for classification models, such as adversarial training, are often expensive to train and difficult to optimise.
We propose a method for mitigating bias by incorporating contrastive learning, in which instances sharing the same class label are encouraged to have similar representations (a minimal sketch of this label-aware contrastive idea appears after this list).
arXiv Detail & Related papers (2021-09-22T10:47:51Z)
- Imbalanced Adversarial Training with Reweighting [33.51820466479575]
We show that adversarially trained models can suffer much worse performance on under-represented classes when the training dataset is imbalanced.
Traditional reweighting strategies may lose their efficacy in dealing with this imbalance under adversarial training.
We propose Separable Reweighted Adversarial Training (SRAT) to facilitate adversarial training under imbalanced scenarios.
arXiv Detail & Related papers (2021-07-28T20:51:36Z)
- On the Robustness of Pretraining and Self-Supervision for a Deep Learning-based Analysis of Diabetic Retinopathy [70.71457102672545]
We compare the impact of different training procedures for diabetic retinopathy grading.
We investigate different aspects such as quantitative performance, statistics of the learned feature representations, interpretability and robustness to image distortions.
Our results indicate that models initialized from ImageNet pretraining show a significant increase in performance, generalization, and robustness to image distortions.
arXiv Detail & Related papers (2021-06-25T08:32:45Z)
- Robust Pre-Training by Adversarial Contrastive Learning [120.33706897927391]
Recent work has shown that, when integrated with adversarial training, self-supervised pre-training can lead to state-of-the-art robustness.
We improve robustness-aware self-supervised pre-training by learning representations consistent under both data augmentations and adversarial perturbations.
arXiv Detail & Related papers (2020-10-26T04:44:43Z)
- Supervision Accelerates Pre-training in Contrastive Semi-Supervised Learning of Visual Representations [12.755943669814236]
We propose a semi-supervised loss, SuNCEt, that aims to distinguish examples of different classes in addition to self-supervised instance-wise pretext tasks.
On ImageNet, we find that SuNCEt can be used to match the semi-supervised learning accuracy of previous contrastive approaches.
Our main insight is that leveraging even a small amount of labeled data during pre-training, and not only during fine-tuning, provides an important signal.
arXiv Detail & Related papers (2020-06-18T18:44:13Z)
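Two of the entries above (Contrastive Learning for Fair Representations and SuNCEt) describe losses in which instances that share a class label are treated as positives for one another. Neither paper's exact objective is reproduced on this page; the following is a minimal, generic sketch of such a label-aware contrastive term, using a SupCon-style formulation chosen for illustration, with all names and the temperature being assumptions.

```python
# Minimal sketch of a class-conditional (label-aware) contrastive term, in the
# spirit of the fair-representation and SuNCEt summaries above: embeddings that
# share a class label are treated as positives for one another. This is a generic
# SupCon-style formulation for illustration, not the exact loss of either paper.
import numpy as np

def class_contrastive_loss(z, labels, temperature=0.1):
    """z: (batch, dim) L2-normalized embeddings; labels: (batch,) int class ids."""
    batch = z.shape[0]
    sim = z @ z.T / temperature
    np.fill_diagonal(sim, -np.inf)          # never contrast a sample with itself
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    same_class = (labels[:, None] == labels[None, :]) & ~np.eye(batch, dtype=bool)
    # average log-probability over each sample's same-class positives (skip singletons)
    losses = [-log_prob[i, same_class[i]].mean() for i in range(batch) if same_class[i].any()]
    return float(np.mean(losses))

# Toy usage with random stand-in embeddings and labels.
rng = np.random.default_rng(0)
z = rng.normal(size=(16, 64)); z /= np.linalg.norm(z, axis=1, keepdims=True)
labels = rng.integers(0, 4, size=16)
print("class-conditional contrastive loss:", class_contrastive_loss(z, labels))
```

In both related entries the intended effect is the same: label information enters the representation-learning objective directly, rather than only through a downstream classifier.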