Universal Representations: A Unified Look at Multiple Task and Domain
Learning
- URL: http://arxiv.org/abs/2204.02744v1
- Date: Wed, 6 Apr 2022 11:40:01 GMT
- Title: Universal Representations: A Unified Look at Multiple Task and Domain
Learning
- Authors: Wei-Hong Li, Xialei Liu, Hakan Bilen
- Abstract summary: We propose a unified look at jointly learning multiple vision tasks and visual domains through universal representations.
We show that universal representations achieve state-of-the-art performance in learning multiple dense prediction problems.
We also conduct multiple analyses through ablation and qualitative studies.
- Score: 37.27708297562079
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a unified look at jointly learning multiple vision tasks and
visual domains through universal representations, a single deep neural network.
Learning multiple problems simultaneously involves minimizing a weighted sum of
multiple loss functions with different magnitudes and characteristics, which can
leave one loss dominating the optimization and yield poorer results than
learning a separate model for each problem. To address this, we propose
distilling the knowledge of multiple task/domain-specific networks into a
single deep neural network after aligning its representations with the
task/domain-specific ones through small-capacity adapters. We rigorously show
that universal representations achieve state-of-the-art performance in
learning multiple dense prediction problems on NYU-v2 and Cityscapes,
multiple image classification problems from diverse domains in the Visual
Decathlon Dataset, and cross-domain few-shot learning on Meta-Dataset. Finally,
we conduct multiple analyses through ablation and qualitative studies.
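The two ingredients of the abstract above can be sketched in a few lines: a weighted sum of per-task losses whose raw magnitudes can become unbalanced, and a distillation term that aligns the shared network's features with a task/domain-specific teacher through a small-capacity adapter. The following is a minimal NumPy illustration of the idea, not the authors' implementation; the single linear adapter `(W, b)` and the mean-squared alignment loss are simplifying assumptions.

```python
import numpy as np

def multitask_loss(task_losses, weights):
    """Weighted sum of per-task losses. When the raw losses have very
    different magnitudes, one term can dominate the gradient signal --
    the imbalance the distillation scheme is meant to avoid."""
    return sum(w * l for w, l in zip(weights, task_losses))

def adapter_distill_loss(student_feat, teacher_feat, W, b):
    """Pass the shared (student) features through a small linear
    adapter (W, b), then penalize the mean squared distance to the
    task/domain-specific teacher's features."""
    adapted = student_feat @ W + b          # small-capacity adapter
    return float(np.mean((adapted - teacher_feat) ** 2))

# Toy usage: 4 samples with 8-dim features, identity adapter.
rng = np.random.default_rng(0)
feat = rng.normal(size=(4, 8))
W, b = np.eye(8), np.zeros(8)
total = multitask_loss([1.0, 2.0], [0.5, 0.25])      # 0.5 + 0.5 = 1.0
align = adapter_distill_loss(feat, feat, W, b)       # 0.0, already aligned
```

In the paper's setting the alignment losses replace the raw task losses during joint training, so the terms being summed share a common scale regardless of how the original objectives are defined.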
Related papers
- A Multitask Deep Learning Model for Classification and Regression of Hyperspectral Images: Application to the large-scale dataset [44.94304541427113]
We propose a multitask deep learning model to perform multiple classification and regression tasks simultaneously on hyperspectral images.
We validated our approach on a large hyperspectral dataset called TAIGA.
A comprehensive qualitative and quantitative analysis of the results shows that the proposed method significantly outperforms other state-of-the-art methods.
arXiv Detail & Related papers (2024-07-23T11:14:54Z) - Multi-Scale and Multi-Layer Contrastive Learning for Domain Generalization [5.124256074746721]
We argue that the generalization ability of deep convolutional neural networks can be improved by taking advantage of multi-layer and multi-scale representations of the network.
We introduce a framework that aims at improving domain generalization of image classifiers by combining both low-level and high-level features at multiple scales.
We show that our model is able to surpass the performance of previous DG methods and consistently produce competitive and state-of-the-art results in all datasets.
arXiv Detail & Related papers (2023-08-28T08:54:27Z) - Capsules as viewpoint learners for human pose estimation [4.246061945756033]
We show that most neural networks are unable to generalize well when the camera undergoes significant viewpoint changes.
We propose a novel end-to-end viewpoint-equivariant capsule autoencoder that employs a fast Variational Bayes routing and matrix capsules.
We achieve state-of-the-art results for multiple tasks and datasets while retaining other desirable properties.
arXiv Detail & Related papers (2023-02-13T09:01:46Z) - Multi-View representation learning in Multi-Task Scene [4.509968166110557]
We propose a novel semi-supervised algorithm, termed Multi-Task Multi-View learning based on Common and Special Features (MTMVCSF).
An anti-noise multi-task multi-view algorithm called AN-MTMVCSF is also proposed, which is strongly adaptive to noisy labels.
The effectiveness of these algorithms is demonstrated by a series of well-designed experiments on both real-world and synthetic data.
arXiv Detail & Related papers (2022-01-15T11:26:28Z) - Exploring Data Aggregation and Transformations to Generalize across
Visual Domains [0.0]
This thesis contributes to research on Domain Generalization (DG), Domain Adaptation (DA) and their variations.
We propose new frameworks for Domain Generalization and Domain Adaptation which make use of feature aggregation strategies and visual transformations.
We show how our proposed solutions outperform competitive state-of-the-art approaches in established DG and DA benchmarks.
arXiv Detail & Related papers (2021-08-20T14:58:14Z) - Learning distinct features helps, provably [98.78384185493624]
We study the diversity of the features learned by a two-layer neural network trained with the least squares loss.
We measure the diversity by the average $L_2$-distance between the hidden-layer features.
arXiv Detail & Related papers (2021-06-10T19:14:45Z) - Joint Learning of Neural Transfer and Architecture Adaptation for Image
Recognition [77.95361323613147]
Current state-of-the-art visual recognition systems rely on pretraining a neural network on a large-scale dataset and finetuning the network weights on a smaller dataset.
In this work, we prove that dynamically adapting network architectures tailored to each domain task, along with weight finetuning, benefits both efficiency and effectiveness.
Our method can be easily generalized to an unsupervised paradigm by replacing supernet training with self-supervised learning in the source domain tasks and performing linear evaluation in the downstream tasks.
arXiv Detail & Related papers (2021-03-31T08:15:17Z) - Distribution Alignment: A Unified Framework for Long-tail Visual
Recognition [52.36728157779307]
We propose a unified distribution alignment strategy for long-tail visual recognition.
We then introduce a generalized re-weight method in the two-stage learning to balance the class prior.
Our approach achieves the state-of-the-art results across all four recognition tasks with a simple and unified framework.
arXiv Detail & Related papers (2021-03-30T14:09:53Z) - Deep Partial Multi-View Learning [94.39367390062831]
We propose a novel framework termed Cross Partial Multi-View Networks (CPM-Nets)
We first provide a formal definition of completeness and versatility for multi-view representation.
We then theoretically prove the versatility of the learned latent representations.
arXiv Detail & Related papers (2020-11-12T02:29:29Z) - Multi-Task Learning for Dense Prediction Tasks: A Survey [87.66280582034838]
Multi-task learning (MTL) techniques have shown promising results w.r.t. performance, computations and/or memory footprint.
We provide a well-rounded view on state-of-the-art deep learning approaches for MTL in computer vision.
arXiv Detail & Related papers (2020-04-28T09:15:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.