CompRess: Self-Supervised Learning by Compressing Representations
- URL: http://arxiv.org/abs/2010.14713v1
- Date: Wed, 28 Oct 2020 02:49:18 GMT
- Title: CompRess: Self-Supervised Learning by Compressing Representations
- Authors: Soroush Abbasi Koohpayegani, Ajinkya Tejankar, and Hamed Pirsiavash
- Abstract summary: We develop a model compression method to compress an already learned, deep self-supervised model (teacher) to a smaller one (student).
We train the student model so that it mimics the relative similarity between the data points in the teacher's embedding space.
This is the first time a self-supervised AlexNet has outperformed a supervised one on ImageNet classification.
- Score: 14.739041141948032
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Self-supervised learning aims to learn good representations with unlabeled
data. Recent works have shown that larger models benefit more from
self-supervised learning than smaller models. As a result, the gap between
supervised and self-supervised learning has been greatly reduced for larger
models. In this work, instead of designing a new pseudo task for
self-supervised learning, we develop a model compression method to compress an
already learned, deep self-supervised model (teacher) to a smaller one
(student). We train the student model so that it mimics the relative similarity
between the data points in the teacher's embedding space. For AlexNet, our
method outperforms all previous methods including the fully supervised model on
ImageNet linear evaluation (59.0% compared to 56.5%) and on nearest neighbor
evaluation (50.7% compared to 41.4%). To the best of our knowledge, this is the
first time a self-supervised AlexNet has outperformed a supervised one on
ImageNet classification. Our code is available here:
https://github.com/UMBCvision/CompRess
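To make the objective concrete, here is a minimal sketch of a similarity-distillation loss in the spirit of the method described above, written as PyTorch-style Python. It assumes a frozen teacher encoder, a trainable student encoder, and a bank of anchor embeddings per network; the names (compress_loss, anchors_s, anchors_t) and the temperature values are illustrative assumptions, not the released implementation.

```python
import torch
import torch.nn.functional as F

def compress_loss(student_emb, teacher_emb, anchors_s, anchors_t,
                  tau_s=0.1, tau_t=0.04):
    """Similarity-distillation sketch: the student is trained so that its
    similarity distribution over a set of anchor points matches the
    teacher's distribution over the same anchors. Names, temperatures,
    and the anchor-bank setup are illustrative assumptions."""
    # L2-normalize so dot products are cosine similarities.
    student_emb = F.normalize(student_emb, dim=1)
    teacher_emb = F.normalize(teacher_emb, dim=1)
    anchors_s = F.normalize(anchors_s, dim=1)
    anchors_t = F.normalize(anchors_t, dim=1)

    # Similarity of each query to every anchor, in each embedding space.
    sim_s = student_emb @ anchors_s.t() / tau_s   # (batch, num_anchors)
    sim_t = teacher_emb @ anchors_t.t() / tau_t

    # The teacher's similarities define the target distribution; the
    # student's distribution is pulled toward it with a KL divergence.
    p_t = F.softmax(sim_t, dim=1).detach()
    log_p_s = F.log_softmax(sim_s, dim=1)
    return F.kl_div(log_p_s, p_t, reduction="batchmean")
```

In use, teacher_emb would come from the frozen self-supervised teacher and student_emb from the student being trained, while the anchor banks could be memory banks of embeddings of other images maintained during training, so that each query's similarity distribution is computed over the same set of points in both embedding spaces.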
Related papers
- Establishing a stronger baseline for lightweight contrastive models [10.63129923292905]
Recent research has reported a performance degradation in self-supervised contrastive learning for specially designed efficient networks.
A common practice is to introduce a pretrained contrastive teacher model and train the lightweight networks with distillation signals generated by the teacher.
In this work, we aim to establish a stronger baseline for lightweight contrastive models without using a pretrained teacher model.
arXiv Detail & Related papers (2022-12-14T11:20:24Z) - Distilling Knowledge from Self-Supervised Teacher by Embedding Graph Alignment [52.704331909850026]
We formulate a new knowledge distillation framework to transfer the knowledge from self-supervised pre-trained models to any other student network.
Inspired by instance discrimination in self-supervised learning, we model instance-instance relations as a graph in the feature embedding space.
Our distillation scheme can be flexibly applied to transfer the self-supervised knowledge to enhance representation learning on various student networks; a rough, generic sketch of such graph alignment appears after this related-papers list.
arXiv Detail & Related papers (2022-11-23T19:27:48Z) - Part-Based Models Improve Adversarial Robustness [57.699029966800644]
We show that combining human prior knowledge with end-to-end learning can improve the robustness of deep neural networks.
Our model combines a part segmentation model with a tiny classifier and is trained end-to-end to simultaneously segment objects into parts and classify them.
Our experiments indicate that these models also reduce texture bias and yield better robustness against common corruptions and spurious correlations.
arXiv Detail & Related papers (2022-09-15T15:41:47Z) - Network Augmentation for Tiny Deep Learning [73.57192520534585]
We introduce Network Augmentation (NetAug), a new training method for improving the performance of tiny neural networks.
We demonstrate the effectiveness of NetAug on image classification and object detection.
arXiv Detail & Related papers (2021-10-17T18:48:41Z) - Unsupervised Representation Learning for 3D Point Cloud Data [66.92077180228634]
We propose a simple yet effective approach for unsupervised point cloud learning.
In particular, we identify a transformation that generates an effective contrastive view of an original point cloud.
We conduct experiments on three downstream tasks which are 3D object classification, shape part segmentation and scene segmentation.
arXiv Detail & Related papers (2021-10-13T10:52:45Z) - Bag of Instances Aggregation Boosts Self-supervised Learning [122.61914701794296]
We propose a simple but effective distillation strategy for unsupervised learning.
Our method, termed BINGO, transfers the relationship learned by the teacher to the student.
BINGO achieves new state-of-the-art performance on small-scale models.
arXiv Detail & Related papers (2021-07-04T17:33:59Z) - Distill on the Go: Online knowledge distillation in self-supervised learning [1.1470070927586016]
Recent works have shown that wider and deeper models benefit more from self-supervised learning than smaller models.
We propose Distill-on-the-Go (DoGo), a self-supervised learning paradigm using single-stage online knowledge distillation.
Our results show significant performance gain in the presence of noisy and limited labels.
arXiv Detail & Related papers (2021-04-20T09:59:23Z) - DisCo: Remedy Self-supervised Learning on Lightweight Models with Distilled Contrastive Learning [94.89221799550593]
Self-supervised representation learning (SSL) has received widespread attention from the community.
Recent research argues that its performance drops sharply when the model size decreases.
We propose a simple yet effective Distilled Contrastive Learning (DisCo) to ease the issue by a large margin.
arXiv Detail & Related papers (2021-04-19T08:22:52Z) - SEED: Self-supervised Distillation For Visual Representation [34.63488756535054]
We propose a new learning paradigm, named SElf-SupErvised Distillation (SEED), which leverages a larger network (as Teacher) to transfer its representational knowledge into a smaller architecture (as Student) in a self-supervised fashion.
We show that SEED dramatically boosts the performance of small networks on downstream tasks.
arXiv Detail & Related papers (2021-01-12T20:04:50Z)
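For the "Distilling Knowledge from Self-Supervised Teacher by Embedding Graph Alignment" entry above, the sketch below illustrates one generic way such an alignment could be set up: build a within-batch cosine-similarity graph in the teacher's and the student's embedding spaces and penalize their disagreement. This is an assumption-laden illustration in the same Python style as the sketch above, not the paper's actual formulation; all names are hypothetical.

```python
import torch
import torch.nn.functional as F

def graph_alignment_loss(student_emb, teacher_emb):
    """Hypothetical illustration: take the "graph" to be the matrix of
    pairwise cosine similarities within a batch, and align the student's
    graph to the teacher's with a mean-squared error."""
    s = F.normalize(student_emb, dim=1)
    t = F.normalize(teacher_emb, dim=1)

    # Pairwise cosine-similarity graphs, shape (batch, batch).
    graph_s = s @ s.t()
    graph_t = (t @ t.t()).detach()  # teacher is frozen, so no gradient

    return F.mse_loss(graph_s, graph_t)
```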
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.