Discriminability-Transferability Trade-Off: An Information-Theoretic
Perspective
- URL: http://arxiv.org/abs/2203.03871v1
- Date: Tue, 8 Mar 2022 06:16:33 GMT
- Title: Discriminability-Transferability Trade-Off: An Information-Theoretic
Perspective
- Authors: Quan Cui, Bingchen Zhao, Zhao-Min Chen, Borui Zhao, Renjie Song,
Jiajun Liang, Boyan Zhou, Osamu Yoshie
- Abstract summary: This work simultaneously considers the discriminability and transferability properties of deep representations in the typical supervised learning task.
From the perspective of information-bottleneck theory, we reveal that the incompatibility between discriminability and transferability is attributed to the over-compression of input information.
- Score: 17.304811383730417
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work simultaneously considers the discriminability and transferability
properties of deep representations in the typical supervised learning task,
i.e., image classification. Through a comprehensive temporal analysis, we
observe a trade-off between these two properties: discriminability keeps
increasing as training progresses, while transferability diminishes sharply in
the later training period.
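A minimal sketch of how such a temporal analysis can be carried out, assuming saved checkpoints and two labeled datasets: discriminability is proxied by linear-probe accuracy on the source task, transferability by linear-probe accuracy on a held-out target task. The checkpoint schedule, the `load_backbone` helper, and the four data loaders are illustrative assumptions, not details from the paper.

```python
import torch
from sklearn.linear_model import LogisticRegression

def extract_features(backbone, loader):
    """Run the frozen backbone over a loader; return (features, labels) as numpy."""
    feats, labels = [], []
    with torch.no_grad():
        for x, y in loader:
            feats.append(backbone(x).flatten(1))
            labels.append(y)
    return torch.cat(feats).numpy(), torch.cat(labels).numpy()

def probe_accuracy(backbone, train_loader, test_loader):
    """Fit a linear probe on frozen features; report held-out accuracy."""
    xtr, ytr = extract_features(backbone, train_loader)
    xte, yte = extract_features(backbone, test_loader)
    return LogisticRegression(max_iter=1000).fit(xtr, ytr).score(xte, yte)

for epoch in (10, 50, 100, 200):                        # hypothetical schedule
    backbone = load_backbone(f"ckpt_epoch_{epoch}.pt")  # assumed helper
    backbone.eval()
    disc = probe_accuracy(backbone, source_train, source_test)   # discriminability
    trans = probe_accuracy(backbone, target_train, target_test)  # transferability
    print(f"epoch {epoch:4d}  discriminability={disc:.3f}  transferability={trans:.3f}")
```

The trade-off described above would show up here as `disc` rising monotonically across checkpoints while `trans` peaks at an intermediate epoch and then decays.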
From the perspective of information-bottleneck theory, we reveal that the
incompatibility between discriminability and transferability is attributed to
the over-compression of input information. More importantly, we investigate why
and how the InfoNCE loss can alleviate the over-compression, and further
present a learning framework, named contrastive temporal coding (CTC), to
counteract the over-compression and alleviate the incompatibility. Extensive
experiments validate that CTC successfully mitigates the incompatibility,
yielding discriminative and transferable representations. Noticeable
improvements are achieved on the image classification task and challenging
transfer learning tasks. We hope that this work will raise the significance of
the transferability property in the conventional supervised learning setting.
Code will be publicly available.
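Two pieces of background make the abstract's argument concrete. First, in information-bottleneck terms, training seeks a representation Z of the input X that is maximally compressed while remaining predictive of the label Y; the standard Lagrangian reads

```latex
\min_{p(z \mid x)} \; I(X;Z) \;-\; \beta \, I(Z;Y)
```

"Over-compression" is the regime where I(X;Z) is driven so low that input information irrelevant to the source labels, but useful for downstream tasks, gets discarded. Second, the InfoNCE loss the paper builds on can be sketched as below; the temperature value and the use of in-batch negatives are conventional choices, not details taken from the CTC framework itself.

```python
import torch
import torch.nn.functional as F

def info_nce(queries: torch.Tensor, keys: torch.Tensor, temperature: float = 0.07):
    """InfoNCE over a batch of (N, D) embeddings: keys[i] is the positive for
    queries[i]; every other key in the batch serves as a negative."""
    q = F.normalize(queries, dim=1)
    k = F.normalize(keys, dim=1)
    logits = q @ k.t() / temperature                    # (N, N) similarities
    targets = torch.arange(q.size(0), device=q.device)  # positives on the diagonal
    return F.cross_entropy(logits, targets)
```

Because minimizing this loss maximizes a lower bound on the mutual information between the two sets of embeddings, it pressures the network to retain shared input information, which is plausibly how it counteracts the over-compression the paper identifies (the "temporal" in CTC suggests the contrasted views come from different training stages, though the abstract does not spell this out).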
Related papers
- Disentangling and Mitigating the Impact of Task Similarity for Continual Learning [1.3597551064547502]
Continual learning of partially similar tasks poses a challenge for artificial neural networks.
High input feature similarity coupled with low readout similarity is catastrophic for both knowledge transfer and retention.
Weight regularization based on the Fisher information metric significantly improves retention, regardless of task similarity.
arXiv Detail & Related papers (2024-05-30T16:40:07Z)
- Evaluating the structure of cognitive tasks with transfer learning [67.22168759751541]
This study investigates the transferability of deep learning representations between different EEG decoding tasks.
We conduct extensive experiments using state-of-the-art decoding models on two recently released EEG datasets.
arXiv Detail & Related papers (2023-07-28T14:51:09Z)
- On Higher Adversarial Susceptibility of Contrastive Self-Supervised Learning [104.00264962878956]
Contrastive self-supervised learning (CSL) has managed to match or surpass the performance of supervised learning in image and video classification.
It is still largely unknown whether the representations induced by the two learning paradigms are similar in nature.
We identify the uniform distribution of data representations over a unit hypersphere in the CSL representation space as the key contributor to this heightened adversarial susceptibility.
We devise strategies that are simple, yet effective in improving model robustness with CSL training.
arXiv Detail & Related papers (2022-07-22T03:49:50Z)
- Fair Contrastive Learning for Facial Attribute Classification [25.436462696033846]
In this paper, we analyze, for the first time, the unfairness caused by supervised contrastive learning.
We propose a new Fair Supervised Contrastive Loss (FSCL) for fair visual representation learning.
Our method is robust to the intensity of data bias and works effectively in incompletely supervised settings.
arXiv Detail & Related papers (2022-03-30T11:16:18Z)
- Simpler, Faster, Stronger: Breaking The log-K Curse On Contrastive Learners With FlatNCE [104.37515476361405]
We reveal mathematically why contrastive learners fail in the small-batch-size regime (see the bound sketched after this list).
We present a novel contrastive objective named FlatNCE, which fixes this issue.
arXiv Detail & Related papers (2021-07-02T15:50:43Z)
- Training GANs with Stronger Augmentations via Contrastive Discriminator [80.8216679195]
We introduce a contrastive representation learning scheme into the GAN discriminator, coined ContraD.
This "fusion" enables the discriminators to work with much stronger augmentations without increasing their training instability.
Our experimental results show that GANs with ContraD consistently improve FID and IS compared to other recent techniques incorporating data augmentations.
arXiv Detail & Related papers (2021-03-17T16:04:54Z)
- Spatial Contrastive Learning for Few-Shot Classification [9.66840768820136]
We propose a novel attention-based spatial contrastive objective to learn locally discriminative and class-agnostic features.
With extensive experiments, we show that the proposed method outperforms state-of-the-art approaches.
arXiv Detail & Related papers (2020-12-26T23:39:41Z)
- Adversarial Training Reduces Information and Improves Transferability [81.59364510580738]
Recent results show that features of adversarially trained networks for classification, in addition to being robust, enable desirable properties such as invertibility.
We show that adversarial training can improve linear transferability to new tasks, giving rise to a new trade-off between the transferability of representations and accuracy on the source task.
arXiv Detail & Related papers (2020-07-22T08:30:16Z)
- What makes instance discrimination good for transfer learning? [82.79808902674282]
We investigate what makes instance discrimination pretraining good for transfer learning.
What really matters for the transfer is low-level and mid-level representations, not high-level representations.
Supervised pretraining can be strengthened by following an exemplar-based approach.
arXiv Detail & Related papers (2020-06-11T16:55:07Z)
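On the "log-K curse" referenced in the FlatNCE entry above: with one positive and K-1 in-batch negatives, InfoNCE yields the well-known mutual-information lower bound

```latex
I(X;Y) \;\ge\; \log K \;-\; \mathcal{L}_{\mathrm{InfoNCE}}
```

Since the loss is non-negative, the bound can never exceed log K; small batches therefore cap the usable training signal, which is the failure mode FlatNCE is designed to fix.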
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.