Multi-Task Self-Training for Learning General Representations
- URL: http://arxiv.org/abs/2108.11353v1
- Date: Wed, 25 Aug 2021 17:20:50 GMT
- Title: Multi-Task Self-Training for Learning General Representations
- Authors: Golnaz Ghiasi, Barret Zoph, Ekin D. Cubuk, Quoc V. Le, Tsung-Yi Lin
- Abstract summary: Multi-task self-training (MuST) harnesses the knowledge in independent specialized teacher models to train a single general student model.
MuST is scalable with unlabeled or partially labeled datasets and outperforms both specialized supervised models and self-supervised models when training on large scale datasets.
- Score: 97.01728635294879
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Despite the fast progress in training specialized models for various tasks,
learning a single general model that works well for many tasks is still
challenging for computer vision. Here we introduce multi-task self-training
(MuST), which harnesses the knowledge in independent specialized teacher models
(e.g., ImageNet model on classification) to train a single general student
model. Our approach has three steps. First, we train specialized teachers
independently on labeled datasets. We then use the specialized teachers to
label an unlabeled dataset to create a multi-task pseudo labeled dataset.
Finally, the dataset, which now contains pseudo labels from teacher models
trained on different datasets/tasks, is then used to train a student model with
multi-task learning. We evaluate the feature representations of the student
model on 6 vision tasks including image recognition (classification, detection,
segmentation) and 3D geometry estimation (depth and surface normal estimation).
MuST is scalable with unlabeled or partially labeled datasets and outperforms
both specialized supervised models and self-supervised models when training on
large scale datasets. Lastly, we show MuST can improve upon already strong
checkpoints trained with billions of examples. The results suggest
self-training is a promising direction to aggregate labeled and unlabeled
training data for learning general feature representations.
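The three MuST steps described above (train specialized teachers, pseudo-label a shared unlabeled set, train one multi-task student) can be illustrated with a toy stand-in. This is a minimal NumPy sketch under our own assumptions, not the paper's implementation: the "teachers" are least-squares linear probes, the "student" is a shared linear backbone with one linear head per task, and all dataset and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # toy feature dimension

# Step 1: train specialized teachers independently on their own labeled data
# (closed-form least squares stands in for supervised training).
def train_teacher(X, y):
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

X_cls, y_cls = rng.normal(size=(100, d)), rng.normal(size=(100, 1))
X_dep, y_dep = rng.normal(size=(100, d)), rng.normal(size=(100, 1))
teacher_cls = train_teacher(X_cls, y_cls)   # e.g. a classification teacher
teacher_dep = train_teacher(X_dep, y_dep)   # e.g. a depth teacher

# Step 2: teachers label a shared unlabeled dataset, yielding one
# multi-task pseudo-labeled dataset.
X_unlabeled = rng.normal(size=(500, d))
pseudo = {
    "cls":   X_unlabeled @ teacher_cls,
    "depth": X_unlabeled @ teacher_dep,
}

# Step 3: a single student (shared backbone W_shared + one head per task)
# is trained with multi-task gradient descent on all pseudo labels.
W_shared = rng.normal(size=(d, d)) * 0.1
heads = {t: rng.normal(size=(d, 1)) * 0.1 for t in pseudo}

def mse_per_task(W, heads):
    return {t: float(np.mean((X_unlabeled @ W @ heads[t] - y) ** 2))
            for t, y in pseudo.items()}

mse_before = mse_per_task(W_shared, heads)
lr = 1e-3
for _ in range(200):
    h = X_unlabeled @ W_shared              # shared representation
    gW = np.zeros_like(W_shared)
    new_heads = {}
    for t, y_t in pseudo.items():
        err = h @ heads[t] - y_t            # per-task residual
        gW += X_unlabeled.T @ (err @ heads[t].T) / len(h)
        new_heads[t] = heads[t] - lr * h.T @ err / len(h)
    W_shared -= lr * gW                     # one step on the summed loss
    heads = new_heads

mse_after = mse_per_task(W_shared, heads)
print(mse_before, mse_after)
```

The key structural point is that a single shared backbone serves every task while the pseudo labels come from teachers that never saw each other's datasets; the real method applies the same recipe with deep networks and task-specific heads.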
Related papers
- Heuristic Vision Pre-Training with Self-Supervised and Supervised Multi-Task Learning [0.0]
We propose a novel pre-training framework by adopting both self-supervised and supervised visual pre-text tasks in a multi-task manner.
Results show that our pre-trained models can deliver results on par with or better than state-of-the-art (SOTA) results on multiple visual tasks.
arXiv Detail & Related papers (2023-10-11T14:06:04Z) - Self-Training and Multi-Task Learning for Limited Data: Evaluation Study on Object Detection [4.9914667450658925]
Experimental results show the improvement of performance when using a weak teacher with unseen data for training a multi-task student.
Despite the limited setup we believe the experimental results show the potential of multi-task knowledge distillation and self-training.
arXiv Detail & Related papers (2023-09-12T14:50:14Z) - Towards Large-scale 3D Representation Learning with Multi-dataset Point Prompt Training [44.790636524264]
Point Prompt Training is a novel framework for multi-dataset synergistic learning in the context of 3D representation learning.
It can overcome the negative transfer associated with synergistic learning and produce generalizable representations.
It achieves state-of-the-art performance on each dataset using a single weight-shared model with supervised multi-dataset training.
arXiv Detail & Related papers (2023-08-18T17:59:57Z) - JEDI: Joint Expert Distillation in a Semi-Supervised Multi-Dataset Student-Teacher Scenario for Video Action Recognition [29.67402932890899]
We propose JEDI, a multi-dataset semi-supervised learning method.
It efficiently combines knowledge from multiple experts, learned on different datasets, to train and improve the performance of individual, per dataset, student models.
arXiv Detail & Related papers (2023-08-09T13:09:07Z) - An Efficient General-Purpose Modular Vision Model via Multi-Task Heterogeneous Training [79.78201886156513]
We present a model that can perform multiple vision tasks and can be adapted to other downstream tasks efficiently.
Our approach achieves comparable results to single-task state-of-the-art models and demonstrates strong generalization on downstream tasks.
arXiv Detail & Related papers (2023-06-29T17:59:57Z) - Distilling Knowledge from Self-Supervised Teacher by Embedding Graph Alignment [52.704331909850026]
We formulate a new knowledge distillation framework to transfer the knowledge from self-supervised pre-trained models to any other student network.
Inspired by the spirit of instance discrimination in self-supervised learning, we model the instance-instance relations by a graph formulation in the feature embedding space.
Our distillation scheme can be flexibly applied to transfer the self-supervised knowledge to enhance representation learning on various student networks.
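The instance-instance graph formulation summarized above can be sketched with a toy example: build a pairwise cosine-similarity "graph" over a batch of teacher embeddings and over the corresponding student embeddings, then penalize the mismatch between the two graphs. This is our own minimal illustration of the general idea, not the paper's exact loss; the batch size, dimensions, and names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def similarity_graph(Z):
    # Cosine-similarity matrix over a batch: entry (i, j) relates
    # instance i to instance j in the embedding space.
    Zn = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    return Zn @ Zn.T

# Hypothetical embeddings for one batch of 16 images. The teacher and
# student may have different embedding widths; both graphs are batch x batch,
# so they remain directly comparable.
teacher_emb = rng.normal(size=(16, 64))
student_emb = rng.normal(size=(16, 32))

G_teacher = similarity_graph(teacher_emb)
G_student = similarity_graph(student_emb)

# Align the student's instance graph with the teacher's.
graph_alignment_loss = float(np.mean((G_student - G_teacher) ** 2))
print(graph_alignment_loss)
```

Because the loss is defined over relations between instances rather than individual logits, it transfers relational structure from the self-supervised teacher to any student architecture, regardless of embedding width.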
arXiv Detail & Related papers (2022-11-23T19:27:48Z) - X-Learner: Learning Cross Sources and Tasks for Universal Visual Representation [71.51719469058666]
We propose a representation learning framework called X-Learner.
X-Learner learns the universal feature of multiple vision tasks supervised by various sources.
X-Learner achieves strong performance on different tasks without extra annotations, modalities and computational costs.
arXiv Detail & Related papers (2022-03-16T17:23:26Z) - Diverse Complexity Measures for Dataset Curation in Self-driving [80.55417232642124]
We propose a new data selection method that exploits a diverse set of criteria to quantify the interestingness of traffic scenes.
Our experiments show that the proposed curation pipeline is able to select datasets that lead to better generalization and higher performance.
arXiv Detail & Related papers (2021-01-16T23:45:02Z) - DiVA: Diverse Visual Feature Aggregation for Deep Metric Learning [83.48587570246231]
Visual Similarity plays an important role in many computer vision applications.
Deep metric learning (DML) is a powerful framework for learning such similarities.
We propose and study multiple complementary learning tasks, targeting conceptually different data relationships.
We learn a single model to aggregate their training signals, resulting in strong generalization and state-of-the-art performance.
arXiv Detail & Related papers (2020-04-28T12:26:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it presents and is not responsible for any consequences arising from its use.