What makes instance discrimination good for transfer learning?
- URL: http://arxiv.org/abs/2006.06606v2
- Date: Tue, 19 Jan 2021 15:45:44 GMT
- Title: What makes instance discrimination good for transfer learning?
- Authors: Nanxuan Zhao and Zhirong Wu and Rynson W.H. Lau and Stephen Lin
- Abstract summary: We investigate what makes instance discrimination pretraining good for transfer learning.
What really matters for the transfer is low-level and mid-level representations, not high-level representations.
Supervised pretraining can be strengthened by following an exemplar-based approach.
- Score: 82.79808902674282
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Contrastive visual pretraining based on the instance discrimination pretext
task has made significant progress. Notably, recent work has shown that
unsupervised pretraining can surpass its supervised counterpart when
finetuning on downstream applications such as object detection and
segmentation. It comes as
a surprise that image annotations would be better left unused for transfer
learning. In this work, we investigate the following problems: What makes
instance discrimination pretraining good for transfer learning? What knowledge
is actually learned and transferred from these models? From this understanding
of instance discrimination, how can we better exploit human annotation labels
for pretraining? Our findings are threefold. First, what truly matters for the
transfer is low-level and mid-level representations, not high-level
representations. Second, the intra-category invariance enforced by the
traditional supervised model weakens transferability by increasing task
misalignment. Finally, supervised pretraining can be strengthened by following
an exemplar-based approach without explicit constraints among the instances
within the same category.
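
To make the pretext task concrete, here is a minimal sketch of an InfoNCE-style instance discrimination loss of the kind this line of work builds on; the MoCo-style memory queue, the function name, and the temperature value are illustrative assumptions rather than the paper's exact setup. The exemplar-based supervised variant suggested above would keep this per-instance objective even when category labels are available, instead of pulling all images of a class toward a shared prototype.

```python
# Minimal sketch of instance discrimination with an InfoNCE loss.
# Assumes a MoCo-style setup: q and k are embeddings of two augmented
# views of the same images; `queue` holds embeddings of other instances.
import torch
import torch.nn.functional as F

def instance_discrimination_loss(q, k, queue, temperature=0.07):
    """q, k: (N, D) view embeddings; queue: (K, D) negatives."""
    q = F.normalize(q, dim=1)
    k = F.normalize(k, dim=1)
    # Positive logit: similarity between the two views of each instance.
    l_pos = torch.einsum("nd,nd->n", q, k).unsqueeze(-1)             # (N, 1)
    # Negative logits: similarity to every other instance in the queue.
    l_neg = torch.einsum("nd,kd->nk", q, F.normalize(queue, dim=1))  # (N, K)
    logits = torch.cat([l_pos, l_neg], dim=1) / temperature
    # Each image is its own class, so the positive is always at index 0.
    labels = torch.zeros(q.size(0), dtype=torch.long, device=q.device)
    return F.cross_entropy(logits, labels)
```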
Related papers
- Discriminability-Transferability Trade-Off: An Information-Theoretic Perspective [17.304811383730417]
This work simultaneously considers the discriminability and transferability properties of deep representations in the typical supervised learning task.
From the perspective of information-bottleneck theory, we reveal that the incompatibility between discriminability and transferability arises from the over-compression of input information.
arXiv Detail & Related papers (2022-03-08T06:16:33Z) - Agree to Disagree: Diversity through Disagreement for Better
- Agree to Disagree: Diversity through Disagreement for Better Transferability [54.308327969778155]
We propose D-BAT (Diversity-By-disAgreement Training), which enforces agreement among the models on the training data but disagreement on out-of-distribution data.
We show how D-BAT naturally emerges from the notion of generalized discrepancy.
arXiv Detail & Related papers (2022-02-09T12:03:02Z) - Improving Transferability of Representations via Augmentation-Aware
- Improving Transferability of Representations via Augmentation-Aware Self-Supervision [117.15012005163322]
AugSelf is an auxiliary self-supervised loss that learns to predict the difference in augmentation parameters between two randomly augmented samples.
Our intuition is that AugSelf encourages the model to preserve augmentation-aware information in the learned representations, which could be beneficial for their transferability.
AugSelf can easily be incorporated into recent state-of-the-art representation learning methods with a negligible additional training cost.
arXiv Detail & Related papers (2021-11-18T10:43:50Z) - Does Pretraining for Summarization Require Knowledge Transfer? [27.297137706355173]
- Does Pretraining for Summarization Require Knowledge Transfer? [27.297137706355173]
We show that pretraining on character n-grams selected at random can nearly match the performance of models pretrained on real corpora.
This work holds the promise of eliminating the need for upstream corpora, which may alleviate some concerns over offensive language, bias, and copyright issues.
arXiv Detail & Related papers (2021-09-10T15:54:15Z) - Improve Unsupervised Pretraining for Few-label Transfer [80.58625921631506]
- Improve Unsupervised Pretraining for Few-label Transfer [80.58625921631506]
In this paper, we find that the advantage of unsupervised over supervised pretraining may not hold when the target dataset has very few labeled samples for finetuning.
We propose a new progressive few-label transfer algorithm for real applications.
arXiv Detail & Related papers (2021-07-26T17:59:56Z) - Training GANs with Stronger Augmentations via Contrastive Discriminator [80.8216679195]
We introduce a contrastive representation learning scheme into the GAN discriminator, coined ContraD.
This "fusion" enables the discriminators to work with much stronger augmentations without increasing their training instability.
Our experimental results show that GANs with ContraD consistently improve FID and IS compared to other recent techniques incorporating data augmentations.
arXiv Detail & Related papers (2021-03-17T16:04:54Z) - Are Fewer Labels Possible for Few-shot Learning? [81.89996465197392]
- Are Fewer Labels Possible for Few-shot Learning? [81.89996465197392]
Few-shot learning is challenging due to its very limited data and labels.
Recent studies in big transfer (BiT) show that few-shot learning can greatly benefit from pretraining on a large-scale labeled dataset in a different domain.
We propose eigen-finetuning to enable fewer-shot learning by leveraging the co-evolution of clustering and eigen-samples during finetuning.
arXiv Detail & Related papers (2020-12-10T18:59:29Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences arising from its use.