Self-supervised Knowledge Distillation for Few-shot Learning
- URL: http://arxiv.org/abs/2006.09785v2
- Date: Tue, 4 Aug 2020 05:22:39 GMT
- Title: Self-supervised Knowledge Distillation for Few-shot Learning
- Authors: Jathushan Rajasegaran, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan,
Mubarak Shah
- Abstract summary: Few-shot learning is a promising learning paradigm due to its ability to learn out-of-order distributions quickly with only a few samples.
We propose a simple approach to improve the representation capacity of deep neural networks for few-shot learning tasks.
Our experiments show that, even in the first stage, self-supervision can outperform current state-of-the-art methods.
- Score: 123.10294801296926
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The real world contains an overwhelmingly large number of object classes,
and learning all of them at once is infeasible. Few-shot learning is a promising
learning paradigm due to its ability to learn out-of-order distributions
quickly with only a few samples. Recent works [7, 41] show that simply learning
a good feature embedding can outperform more sophisticated meta-learning and
metric learning algorithms for few-shot learning. In this paper, we propose a
simple approach to improve the representation capacity of deep neural networks
for few-shot learning tasks. We follow a two-stage learning process: First, we
train a neural network to maximize the entropy of the feature embedding, thus
creating an optimal output manifold using a self-supervised auxiliary loss. In
the second stage, we minimize the entropy of the feature embedding by bringing
self-supervised twins together, while constraining the manifold with
student-teacher distillation. Our experiments show that, even in the first
stage, self-supervision can outperform current state-of-the-art methods, with
further gains achieved by our second-stage distillation process. Our code is
available at: https://github.com/brjathu/SKD.
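The two-stage process described in the abstract can be summarized in a short training sketch. The PyTorch fragment below is a minimal illustration, not the official implementation from the linked repository: it assumes rotation prediction as the self-supervised auxiliary task, and the module names (Net, rot_head), the tiny backbone, and the loss weights alpha/beta and temperature tau are illustrative choices.

```python
# Minimal sketch of the two-stage process described in the abstract, in PyTorch.
# Rotation prediction is assumed as the self-supervised auxiliary task; names,
# loss weights, and the tiny backbone are illustrative, not from the SKD repo.
import torch
import torch.nn as nn
import torch.nn.functional as F


def make_rotations(x):
    """Return 4 rotated copies of a batch (0/90/180/270 deg) and rotation labels."""
    rots = [torch.rot90(x, k, dims=(2, 3)) for k in range(4)]
    labels = torch.arange(4).repeat_interleave(x.size(0))
    return torch.cat(rots, dim=0), labels


class Net(nn.Module):
    def __init__(self, feat_dim=64, num_classes=64):
        super().__init__()
        self.backbone = nn.Sequential(               # stand-in for a ResNet backbone
            nn.Conv2d(3, feat_dim, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.classifier = nn.Linear(feat_dim, num_classes)
        self.rot_head = nn.Linear(feat_dim, 4)       # self-supervised auxiliary head

    def forward(self, x):
        z = self.backbone(x)
        return z, self.classifier(z), self.rot_head(z)


def train_generation_zero(model, loader, opt, alpha=1.0):
    """Stage 1: supervised cross-entropy plus a rotation self-supervision loss
    that spreads the embedding over the output manifold."""
    model.train()
    for x, y in loader:
        xr, r = make_rotations(x)
        _, logits, rot_logits = model(xr)
        loss = F.cross_entropy(logits, y.repeat(4)) + alpha * F.cross_entropy(rot_logits, r)
        opt.zero_grad()
        loss.backward()
        opt.step()


def train_generation_one(student, teacher, loader, opt, beta=0.1, tau=4.0):
    """Stage 2: distill from the frozen stage-1 teacher while pulling the
    embeddings of an image and its self-supervised twins (rotated copies) together."""
    teacher.eval()
    student.train()
    for x, _ in loader:
        xr, _ = make_rotations(x)
        with torch.no_grad():
            _, t_logits, _ = teacher(xr)
        z, s_logits, _ = student(xr)
        kd = F.kl_div(F.log_softmax(s_logits / tau, dim=1),
                      F.softmax(t_logits / tau, dim=1),
                      reduction="batchmean") * tau * tau
        # bring the 4 rotated twins of each image close in embedding space
        z = F.normalize(z, dim=1).view(4, x.size(0), -1)
        twins = ((z - z.mean(dim=0, keepdim=True)) ** 2).sum(dim=2).mean()
        loss = kd + beta * twins
        opt.zero_grad()
        loss.backward()
        opt.step()
```

In practice the stage-2 student would typically start from a copy of the stage-1 model (e.g. via copy.deepcopy) before distillation begins; the loss weights and temperature above are placeholders rather than values taken from the paper.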
Related papers
- MOCA: Self-supervised Representation Learning by Predicting Masked Online Codebook Assignments [72.6405488990753]
Self-supervised learning can be used to mitigate the greedy data needs of Vision Transformer networks.
We propose a single-stage and standalone method, MOCA, which unifies both desired properties.
We achieve new state-of-the-art results on low-shot settings and strong experimental results in various evaluation protocols.
arXiv Detail & Related papers (2023-07-18T15:46:20Z) - PIVOT: Prompting for Video Continual Learning [50.80141083993668]
We introduce PIVOT, a novel method that leverages extensive knowledge in pre-trained models from the image domain.
Our experiments show that PIVOT improves state-of-the-art methods by a significant 27% on the 20-task ActivityNet setup.
arXiv Detail & Related papers (2022-12-09T13:22:27Z) - EfficientTrain: Exploring Generalized Curriculum Learning for Training
Visual Backbones [80.662250618795]
This paper presents a new curriculum learning approach for the efficient training of visual backbones (e.g., vision Transformers).
As an off-the-shelf method, it reduces the wall-time training cost of a wide variety of popular models by >1.5x on ImageNet-1K/22K without sacrificing accuracy.
arXiv Detail & Related papers (2022-11-17T17:38:55Z) - SSMTL++: Revisiting Self-Supervised Multi-Task Learning for Video
Anomaly Detection [108.57862846523858]
We revisit the self-supervised multi-task learning framework, proposing several updates to the original method.
We modernize the 3D convolutional backbone by introducing multi-head self-attention modules (see the illustrative sketch after this list).
In our attempt to further improve the model, we study additional self-supervised learning tasks, such as predicting segmentation maps.
arXiv Detail & Related papers (2022-07-16T19:25:41Z) - Learning Deep Representation with Energy-Based Self-Expressiveness for
Subspace Clustering [24.311754971064303]
We propose a new deep subspace clustering framework, motivated by the energy-based models.
Given the powerful representation ability of recently popular self-supervised learning methods, we leverage self-supervised representation learning to learn the dictionary.
arXiv Detail & Related papers (2021-10-28T11:51:08Z) - Distill on the Go: Online knowledge distillation in self-supervised
learning [1.1470070927586016]
Recent works have shown that wider and deeper models benefit more from self-supervised learning than smaller models.
We propose Distill-on-the-Go (DoGo), a self-supervised learning paradigm using single-stage online knowledge distillation.
Our results show significant performance gains in the presence of noisy and limited labels.
arXiv Detail & Related papers (2021-04-20T09:59:23Z) - Self-Supervised Training Enhances Online Continual Learning [37.91734641808391]
In continual learning, a system must incrementally learn from a non-stationary data stream without catastrophic forgetting.
Self-supervised pre-training could yield features that generalize better than those from supervised learning.
Our best system achieves a 14.95% relative increase in top-1 accuracy on class incremental ImageNet over the prior state of the art for online continual learning.
arXiv Detail & Related papers (2021-03-25T17:45:27Z) - Building One-Shot Semi-supervised (BOSS) Learning up to Fully Supervised
Performance [0.0]
We show the potential for building one-shot semi-supervised (BOSS) learning on CIFAR-10 and SVHN.
Our method combines class prototype refining, class balancing, and self-training.
Rigorous empirical evaluations provide evidence that labeling large datasets is not necessary for training deep neural networks.
arXiv Detail & Related papers (2020-06-16T17:56:00Z) - Rethinking Few-Shot Image Classification: a Good Embedding Is All You
Need? [72.00712736992618]
We show that a simple baseline: learning a supervised or self-supervised representation on the meta-training set, outperforms state-of-the-art few-shot learning methods.
An additional boost can be achieved through the use of self-distillation.
We believe that our findings motivate a rethinking of few-shot image classification benchmarks and the associated role of meta-learning algorithms.
arXiv Detail & Related papers (2020-03-25T17:58:42Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.