HSVA: Hierarchical Semantic-Visual Adaptation for Zero-Shot Learning
- URL: http://arxiv.org/abs/2109.15163v1
- Date: Thu, 30 Sep 2021 14:27:50 GMT
- Title: HSVA: Hierarchical Semantic-Visual Adaptation for Zero-Shot Learning
- Authors: Shiming Chen, Guo-Sen Xie, Qinmu Peng, Yang Liu, Baigui Sun, Hao Li, Xinge You, Ling Shao
- Abstract summary: Zero-shot learning (ZSL) tackles the unseen class recognition problem, transferring semantic knowledge from seen classes to unseen ones.
We propose a novel hierarchical semantic-visual adaptation (HSVA) framework to align semantic and visual domains.
Experiments on four benchmark datasets demonstrate HSVA achieves superior performance on both conventional and generalized ZSL.
- Score: 74.76431541169342
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Zero-shot learning (ZSL) tackles the unseen class recognition problem,
transferring semantic knowledge from seen classes to unseen ones. Typically, to
guarantee desirable knowledge transfer, a common (latent) space is adopted for
associating the visual and semantic domains in ZSL. However, existing common
space learning methods align the semantic and visual domains by merely
mitigating distribution disagreement through one-step adaptation. This strategy
is usually ineffective due to the heterogeneous nature of the feature
representations in the two domains, which intrinsically contain both
distribution and structure variations. To address this and advance ZSL, we
propose a novel hierarchical semantic-visual adaptation (HSVA) framework.
Specifically, HSVA aligns the semantic and visual domains by adopting a
hierarchical two-step adaptation, i.e., structure adaptation and distribution
adaptation. In the structure adaptation step, we take two task-specific
encoders to encode the source data (visual domain) and the target data
(semantic domain) into a structure-aligned common space. To this end, a
supervised adversarial discrepancy (SAD) module is proposed to adversarially
minimize the discrepancy between the predictions of two task-specific
classifiers, thus making the visual and semantic feature manifolds more closely
aligned. In the distribution adaptation step, we directly minimize the
Wasserstein distance between the latent multivariate Gaussian distributions to
align the visual and semantic distributions using a common encoder. Finally,
the structure and distribution adaptation are derived in a unified framework
under two partially-aligned variational autoencoders. Extensive experiments on
four benchmark datasets demonstrate that HSVA achieves superior performance on
both conventional and generalized ZSL. The code is available at
https://github.com/shiming-chen/HSVA.
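The two adaptation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes diagonal-covariance Gaussians (a common choice for VAE latents), for which the squared 2-Wasserstein distance has a simple closed form, and it assumes a mean absolute difference between softmax outputs as the classifier discrepancy; the actual SAD loss in the paper may be defined differently. The function names are hypothetical.

```python
import math


def w2_squared_diag_gaussians(mu_a, sigma_a, mu_b, sigma_b):
    """Squared 2-Wasserstein distance between two diagonal Gaussians.

    Assumption: covariances are diagonal, so the closed form reduces to
    ||mu_a - mu_b||^2 + ||sigma_a - sigma_b||^2, where sigma_* are
    per-dimension standard deviations (not variances).
    """
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_a, mu_b))
    std_term = sum((a - b) ** 2 for a, b in zip(sigma_a, sigma_b))
    return mean_term + std_term


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classifier_discrepancy(logits_1, logits_2):
    """Mean absolute difference between two classifiers' softmax outputs.

    Assumption: an L1-style discrepancy, used here only for illustration.
    """
    p1 = softmax(logits_1)
    p2 = softmax(logits_2)
    return sum(abs(a - b) for a, b in zip(p1, p2)) / len(p1)


if __name__ == "__main__":
    # Hypothetical latent statistics for one visual and one semantic sample.
    visual_mu, visual_sigma = [0.0, 0.0], [1.0, 1.0]
    semantic_mu, semantic_sigma = [3.0, 4.0], [1.0, 1.0]
    print(w2_squared_diag_gaussians(visual_mu, visual_sigma,
                                    semantic_mu, semantic_sigma))
    # Two classifiers that agree exactly have zero discrepancy.
    print(classifier_discrepancy([2.0, 1.0], [2.0, 1.0]))
```

In a training loop, the first quantity would be minimized to pull the visual and semantic latent distributions together, while the second would be maximized with respect to the classifiers and minimized with respect to the encoders, in the adversarial fashion the abstract describes.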
Related papers
- Generalize or Detect? Towards Robust Semantic Segmentation Under Multiple Distribution Shifts [56.57141696245328]
In open-world scenarios, where both novel classes and domains may exist, an ideal segmentation model should detect anomaly classes for safety.
Existing methods often struggle to distinguish between domain-level and semantic-level distribution shifts.
(2024-11-06T11:03:02Z)
- Adaptive Betweenness Clustering for Semi-Supervised Domain Adaptation [108.40945109477886]
We propose a novel SSDA approach named Graph-based Adaptive Betweenness Clustering (G-ABC) for achieving categorical domain alignment.
Our method outperforms previous state-of-the-art SSDA approaches, demonstrating the superiority of the proposed G-ABC algorithm.
(2024-01-21T09:57:56Z)
- Bi-directional Distribution Alignment for Transductive Zero-Shot Learning [48.80413182126543]
We propose a novel zero-shot learning model (TZSL) called Bi-VAEGAN.
It largely improves the shift by a strengthened distribution alignment between the visual and auxiliary spaces.
In benchmark evaluation, Bi-VAEGAN achieves the new state of the arts under both the standard and generalized TZSL settings.
(2023-03-15T15:32:59Z)
- Distribution Regularized Self-Supervised Learning for Domain Adaptation of Semantic Segmentation [3.284878354988896]
This paper proposes a pixel-level distribution regularization scheme (DRSL) for self-supervised domain adaptation of semantic segmentation.
In a typical setting, the classification loss forces the semantic segmentation model to greedily learn the representations that capture inter-class variations.
We capture pixel-level intra-class variations through class-aware multi-modal distribution learning.
(2022-06-20T09:52:49Z)
- Semi-supervised Domain Adaptive Structure Learning [72.01544419893628]
Semi-supervised domain adaptation (SSDA) is a challenging problem requiring methods to overcome both 1) overfitting towards poorly annotated data and 2) distribution shift across domains.
We introduce an adaptive structure learning method to regularize the cooperation of SSL and DA.
(2021-12-12T06:11:16Z)
- Semi-supervised Domain Adaptation for Semantic Segmentation [3.946367634483361]
We propose a novel two-step semi-supervised dual-domain adaptation (SSDDA) approach to address both cross- and intra-domain gaps in semantic segmentation.
We demonstrate that the proposed approach outperforms state-of-the-art methods on two common synthetic-to-real semantic segmentation benchmarks.
(2021-10-20T16:13:00Z)
- Unsupervised Domain Adaptation for Semantic Segmentation via Low-level Edge Information Transfer [27.64947077788111]
Unsupervised domain adaptation for semantic segmentation aims to make models trained on synthetic data adapt to real images.
Previous feature-level adversarial learning methods only consider adapting models on the high-level semantic features.
We present the first attempt at explicitly using low-level edge information, which has a small inter-domain gap, to guide the transfer of semantic information.
(2021-09-18T11:51:31Z)
- Contextual-Relation Consistent Domain Adaptation for Semantic Segmentation [44.19436340246248]
This paper presents an innovative local contextual-relation consistent domain adaptation technique.
It aims to achieve local-level consistencies during the global-level alignment.
Experiments demonstrate its superior segmentation performance as compared with state-of-the-art methods.
(2020-07-05T19:00:46Z)
- Alleviating Semantic-level Shift: A Semi-supervised Domain Adaptation Method for Semantic Segmentation [97.8552697905657]
A key challenge of this task is how to alleviate the data distribution discrepancy between the source and target domains.
We propose Alleviating Semantic-level Shift (ASS), which can successfully promote the distribution consistency from both global and local views.
We apply our ASS to two domain adaptation tasks, from GTA5 to Cityscapes and from Synthia to Cityscapes.
(2020-04-02T03:25:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.