Towards Recognizing New Semantic Concepts in New Visual Domains
- URL: http://arxiv.org/abs/2012.09058v1
- Date: Wed, 16 Dec 2020 16:23:40 GMT
- Title: Towards Recognizing New Semantic Concepts in New Visual Domains
- Authors: Massimiliano Mancini
- Abstract summary: We argue that it is crucial to design deep architectures that can operate in previously unseen visual domains and recognize novel semantic concepts.
In the first part of the thesis, we describe different solutions to enable deep models to generalize to new visual domains.
In the second part, we show how to extend the knowledge of a pretrained deep model to new semantic concepts, without access to the original training set.
- Score: 9.701036831490768
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning models heavily rely on large scale annotated datasets for
training. Unfortunately, datasets cannot capture the infinite variability of
the real world, thus neural networks are inherently limited by the restricted
visual and semantic information contained in their training set. In this
thesis, we argue that it is crucial to design deep architectures that can
operate in previously unseen visual domains and recognize novel semantic
concepts. In the first part of the thesis, we describe different solutions to
enable deep models to generalize to new visual domains by transferring knowledge from one or more labeled source domains to a target domain where no labeled data are available. We will show how variants of batch normalization (BN) can
be applied to different scenarios, from domain adaptation when source and
target are mixtures of multiple latent domains, to domain generalization,
continuous domain adaptation, and predictive domain adaptation, where
information about the target domain is available only in the form of metadata.
In the second part of the thesis, we show how to extend the knowledge of a
pretrained deep model to new semantic concepts, without access to the original
training set. We address the scenarios of sequential multi-task learning, using transformed task-specific binary masks; open-world recognition, with end-to-end training and enforced clustering; and incremental class learning in semantic segmentation, where we highlight and address the problem of the semantic shift of the background class. In the final part, we tackle a more challenging
problem: given images of multiple domains and semantic categories (with their
attributes), how to build a model that recognizes images of unseen concepts in
unseen domains? We also propose an approach based on domain and semantic mixing
of inputs and features, which is a first, promising step towards solving this
problem.
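As a rough, hypothetical illustration of the batch-normalization idea underlying the first part of the thesis (the text above gives no code, so the class name `DomainSpecificBatchNorm2d`, the use of PyTorch, and the assumption that the domain of each batch is known are all illustrative choices), the sketch below keeps separate BN statistics and affine parameters per domain while every other layer of the network stays shared:

```python
# Minimal sketch of domain-specific batch normalization (illustrative only,
# not the thesis implementation): each known domain gets its own BatchNorm2d,
# so its feature statistics are estimated and normalized independently.
import torch
import torch.nn as nn


class DomainSpecificBatchNorm2d(nn.Module):
    """One BatchNorm2d per domain; the domain index of the current batch
    selects which running statistics and affine parameters are used."""

    def __init__(self, num_features: int, num_domains: int):
        super().__init__()
        self.bns = nn.ModuleList(
            [nn.BatchNorm2d(num_features) for _ in range(num_domains)]
        )

    def forward(self, x: torch.Tensor, domain_idx: int) -> torch.Tensor:
        return self.bns[domain_idx](x)


if __name__ == "__main__":
    # Normalize a batch of 64-channel feature maps with the statistics of domain 1.
    dsbn = DomainSpecificBatchNorm2d(num_features=64, num_domains=3)
    feats = torch.randn(8, 64, 32, 32)  # (batch, channels, height, width)
    print(dsbn(feats, domain_idx=1).shape)  # torch.Size([8, 64, 32, 32])
```

Normalizing each domain with its own statistics lets the shared layers operate on roughly aligned feature distributions; the BN-based variants in the thesis build on this intuition for the harder settings listed above (latent-domain mixtures, domain generalization, continuous and predictive adaptation), which this sketch does not attempt to reproduce.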
Related papers
- Learning to Generalize Unseen Domains via Multi-Source Meta Learning for Text Classification [71.08024880298613]
We study multi-source domain generalization for text classification.
We propose a framework to use multiple seen domains to train a model that can achieve high accuracy in an unseen domain.
arXiv Detail & Related papers (2024-09-20T07:46:21Z)
- Compositional Semantic Mix for Domain Adaptation in Point Cloud Segmentation [65.78246406460305]
Compositional semantic mixing represents the first unsupervised domain adaptation technique for point cloud segmentation.
We present a two-branch symmetric network architecture capable of concurrently processing point clouds from a source domain (e.g. synthetic) and point clouds from a target domain (e.g. real-world).
arXiv Detail & Related papers (2023-08-28T14:43:36Z)
- Domain-invariant Prototypes for Semantic Segmentation [30.932130453313537]
We present an easy-to-train framework that learns domain-invariant prototypes for domain adaptive semantic segmentation.
Our method involves only one-stage training and does not need to be trained on large-scale un-annotated target images.
arXiv Detail & Related papers (2022-08-12T02:21:05Z)
- Few-Shot Object Detection in Unseen Domains [4.36080478413575]
Few-shot object detection (FSOD) has thrived in recent years to learn novel object classes with limited data.
We propose various data augmentation techniques on the few shots of novel classes to account for all possible domain-specific information.
Our experiments on the T-LESS dataset show that the proposed approach succeeds in alleviating the domain gap considerably.
arXiv Detail & Related papers (2022-04-11T13:16:41Z)
- Structured Latent Embeddings for Recognizing Unseen Classes in Unseen Domains [108.11746235308046]
We propose a novel approach that learns domain-agnostic structured latent embeddings by projecting images from different domains.
Our experiments on the challenging DomainNet and DomainNet-LS benchmarks show the superiority of our approach over existing methods.
arXiv Detail & Related papers (2021-07-12T17:57:46Z)
- A Review of Single-Source Deep Unsupervised Visual Domain Adaptation [81.07994783143533]
Large-scale labeled training datasets have enabled deep neural networks to excel across a wide range of benchmark vision tasks.
In many applications, it is prohibitively expensive and time-consuming to obtain large quantities of labeled data.
To cope with limited labeled training data, many have attempted to directly apply models trained on a large-scale labeled source domain to another sparsely labeled or unlabeled target domain.
arXiv Detail & Related papers (2020-09-01T00:06:50Z)
- Domain Adaptation for Semantic Parsing [68.81787666086554]
We propose a novel semantic parser for domain adaptation, where we have much fewer annotated data in the target domain compared to the source domain.
Our semantic parser benefits from a two-stage coarse-to-fine framework, and thus can provide different and accurate treatments for the two stages.
Experiments on a benchmark dataset show that our method consistently outperforms several popular domain adaptation strategies.
arXiv Detail & Related papers (2020-06-23T14:47:41Z)
- Spatial Attention Pyramid Network for Unsupervised Domain Adaptation [66.75008386980869]
Unsupervised domain adaptation is critical in various computer vision tasks.
We design a new spatial attention pyramid network for unsupervised domain adaptation.
Our method performs favorably against the state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2020-03-29T09:03:23Z)
- Learning to adapt class-specific features across domains for semantic segmentation [36.36210909649728]
In this thesis, we present a novel architecture, which learns to adapt features across domains by taking into account per class information.
We adopt the recently introduced StarGAN architecture as the image translation backbone, since it is able to perform translations across multiple domains by means of a single generator network.
arXiv Detail & Related papers (2020-01-22T23:51:30Z)