Learning Prototype via Placeholder for Zero-shot Recognition
- URL: http://arxiv.org/abs/2207.14581v1
- Date: Fri, 29 Jul 2022 09:56:44 GMT
- Title: Learning Prototype via Placeholder for Zero-shot Recognition
- Authors: Zaiquan Yang, Yang Liu, Wenjia Xu, Chong Huang, Lei Zhou, Chao Tong
- Abstract summary: We propose to learn prototypes via placeholders, termed LPL, to eliminate the domain shift between seen and unseen classes.
We exploit a novel semantic-oriented fine-tuning strategy to guarantee the semantic reliability of placeholders.
Experiments on five benchmark datasets demonstrate the significant performance gain of LPL over the state-of-the-art methods.
- Score: 18.204927316433448
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Zero-shot learning (ZSL) aims to recognize unseen classes by exploiting
semantic descriptions shared between seen classes and unseen classes. Current
methods show that it is effective to learn visual-semantic alignment by
projecting semantic embeddings into the visual space as class prototypes.
However, such a projection function is only concerned with seen classes. When
applied to unseen classes, the prototypes often perform suboptimally due to
domain shift. In this paper, we propose to learn prototypes via placeholders,
termed LPL, to eliminate the domain shift between seen and unseen classes.
Specifically, we combine seen classes to hallucinate new classes which act as
placeholders of the unseen classes in the visual and semantic space. Placed
between seen classes, the placeholders encourage the prototypes of seen classes
to be highly dispersed, leaving more room for the insertion of well-separated
unseen ones. Empirically, well-separated prototypes help
counteract visual-semantic misalignment caused by domain shift. Furthermore, we
exploit a novel semantic-oriented fine-tuning strategy to guarantee the semantic
reliability of placeholders. Extensive experiments on five benchmark datasets
demonstrate the significant performance gain of LPL over the state-of-the-art
methods. Code is available at https://github.com/zaiquanyang/LPL.
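The placeholder construction described above can be pictured in a few lines of code. Below is a minimal numpy sketch, assuming a mixup-style convex combination of class pairs with a Beta-sampled coefficient; the function names, toy sizes, and the Beta choice are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

n_seen, d_vis, d_sem = 10, 512, 85              # toy sizes (assumption)
vis_proto = rng.normal(size=(n_seen, d_vis))    # visual prototypes of seen classes
sem_embed = rng.normal(size=(n_seen, d_sem))    # semantic embeddings (e.g., attributes)

def hallucinate_placeholders(vis, sem, n_new, rng):
    """Combine random pairs of seen classes into placeholder classes,
    using the same mixing coefficient in both spaces."""
    i = rng.integers(0, len(vis), size=n_new)
    j = rng.integers(0, len(vis), size=n_new)
    lam = rng.beta(2.0, 2.0, size=(n_new, 1))   # Beta-sampled coefficient (assumption)
    return (lam * vis[i] + (1 - lam) * vis[j],
            lam * sem[i] + (1 - lam) * sem[j])

vis_ph, sem_ph = hallucinate_placeholders(vis_proto, sem_embed, n_new=20, rng=rng)
print(vis_ph.shape, sem_ph.shape)  # (20, 512) (20, 85)
```

Sharing one coefficient across both spaces keeps each placeholder's visual and semantic views consistent, so it can sit between seen classes as a stand-in for an unseen one.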
Related papers
- Language-Driven Visual Consensus for Zero-Shot Semantic Segmentation [114.72734384299476]
We propose a Language-Driven Visual Consensus (LDVC) approach, fostering improved alignment of semantic and visual information.
We leverage class embeddings as anchors due to their discrete and abstract nature, steering vision features toward class embeddings.
Our approach significantly boosts the capacity of segmentation models for unseen classes.
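One plausible reading of "class embeddings as anchors" is a cosine pull of vision features toward their class's embedding. A minimal numpy sketch under that assumption (the loss form is illustrative, not LDVC's actual objective):

```python
import numpy as np

def anchor_alignment_loss(feats, class_embeds, labels):
    """Pull each (pixel or region) feature toward its class-embedding anchor
    via cosine similarity; minimizing this steers features to the anchors."""
    f = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    a = class_embeds / np.linalg.norm(class_embeds, axis=1, keepdims=True)
    cos = np.sum(f * a[labels], axis=1)   # similarity to the feature's own anchor
    return float(np.mean(1.0 - cos))
```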
arXiv Detail & Related papers (2024-03-13T11:23:55Z)
- Hunting Attributes: Context Prototype-Aware Learning for Weakly Supervised Semantic Segmentation [22.591512454923883]
We argue that the knowledge bias between instances and contexts affects the capability of the prototype to sufficiently understand instance semantics.
Inspired by prototype learning theory, we propose leveraging prototype awareness to capture diverse and fine-grained feature attributes of instances.
We present a Context Prototype-Aware Learning (CPAL) strategy, which leverages semantic context to enrich instance comprehension.
arXiv Detail & Related papers (2024-03-12T13:11:58Z)
- Learning Semantic Ambiguities for Zero-Shot Learning [0.0]
We propose a regularization method that can be applied to any conditional generative-based ZSL method.
It learns to synthesize discriminative features for semantic descriptions that are not available at training time, i.e., those of the unseen classes.
The approach is evaluated for ZSL and GZSL on four datasets commonly used in the literature.
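As a rough illustration of such a regularizer, here is a minimal numpy sketch that interpolates two seen descriptions into "virtual" semantics and penalizes the generator when their synthesized features collapse together; the linear generator, the hinge form, and all names are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
d_sem, d_noise, d_feat = 85, 16, 64
W = rng.normal(size=(d_sem + d_noise, d_feat)) * 0.1   # stand-in generator weights

def G(sem, z):
    """Stand-in conditional generator (a single linear layer for brevity)."""
    return np.tanh(np.concatenate([sem, z]) @ W)

def ambiguity_regularizer(sem_a, sem_b):
    """Build virtual 'unseen-like' semantics by interpolation and ask the
    generator to keep their features at least as separated as the semantics
    themselves (hinge penalty; illustrative, not the paper's exact loss)."""
    lam = rng.uniform(0.2, 0.8)
    v1 = lam * sem_a + (1 - lam) * sem_b
    v2 = (1 - lam) * sem_a + lam * sem_b
    z = rng.normal(size=d_noise)
    f1, f2 = G(v1, z), G(v2, z)
    gap = np.linalg.norm(v1 - v2) - np.linalg.norm(f1 - f2)
    return max(0.0, float(gap))

print(ambiguity_regularizer(rng.normal(size=d_sem), rng.normal(size=d_sem)))
```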
arXiv Detail & Related papers (2022-01-05T21:08:29Z)
- Dual Prototypical Contrastive Learning for Few-shot Semantic Segmentation [55.339405417090084]
We propose a dual prototypical contrastive learning approach tailored to the few-shot semantic segmentation (FSS) task.
The main idea is to make the prototypes more discriminative by increasing the inter-class distance while reducing the intra-class distance in the prototype feature space.
We demonstrate that the proposed dual contrastive learning approach outperforms state-of-the-art FSS methods on PASCAL-5i and COCO-20i datasets.
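The stated idea translates directly into a loss over prototypes. A minimal numpy sketch, assuming mean-feature prototypes and a hinge margin on prototype gaps (illustrative, not the paper's exact dual formulation):

```python
import numpy as np

def prototype_contrastive_loss(feats, labels, margin=1.0):
    """Pull features toward their class prototype (intra-class term) and push
    prototypes apart (inter-class term)."""
    classes = np.unique(labels)
    protos = np.stack([feats[labels == c].mean(axis=0) for c in classes])
    intra = np.mean([np.linalg.norm(feats[labels == c] - protos[k], axis=1).mean()
                     for k, c in enumerate(classes)])
    dists = np.linalg.norm(protos[:, None] - protos[None, :], axis=-1)
    iu = np.triu_indices(len(classes), k=1)
    inter = np.mean(np.maximum(0.0, margin - dists[iu]))  # hinge on prototype gaps
    return float(intra + inter)
```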
arXiv Detail & Related papers (2021-11-09T08:14:50Z)
- SEGA: Semantic Guided Attention on Visual Prototype for Few-Shot Learning [85.2093650907943]
We propose SEmantic Guided Attention (SEGA) to teach machines to recognize a new category.
SEGA uses semantic knowledge to guide the visual perception in a top-down manner about what visual features should be paid attention to.
We show that our semantic guided attention achieves its anticipated function and outperforms the state of the art.
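One common realization of top-down semantic guidance is a channel gate computed from the class semantics. A minimal numpy sketch under that assumption (the sigmoid-gate form is not necessarily SEGA's exact architecture):

```python
import numpy as np

rng = np.random.default_rng(4)

def semantic_guided_attention(visual_feats, sem_embed, W):
    """Top-down gating: map the class semantic embedding to per-channel
    weights and rescale the visual features."""
    gate = 1.0 / (1.0 + np.exp(-(sem_embed @ W)))   # (d_vis,) channel attention
    return visual_feats * gate

vis = rng.normal(size=(5, 64))      # 5 support features
sem = rng.normal(size=300)          # e.g., a word embedding of the class name
W = rng.normal(size=(300, 64)) * 0.1
print(semantic_guided_attention(vis, sem, W).shape)  # (5, 64)
```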
arXiv Detail & Related papers (2021-11-08T08:03:44Z)
- Rich Semantics Improve Few-shot Learning [49.11659525563236]
We show that by using 'class-level' language descriptions, which can be acquired with minimal annotation cost, we can improve few-shot learning performance.
We develop a Transformer-based forward and backward encoding mechanism to relate visual and semantic tokens.
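A hedged sketch of what bidirectional visual-semantic encoding can look like: single-head cross-attention run in both directions (an interpretation, not the paper's exact mechanism):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(queries, keys_values):
    """Single-head scaled dot-product cross-attention."""
    d = queries.shape[-1]
    attn = softmax(queries @ keys_values.T / np.sqrt(d))
    return attn @ keys_values

rng = np.random.default_rng(2)
vis = rng.normal(size=(49, 64))   # visual tokens (e.g., a 7x7 feature map)
sem = rng.normal(size=(12, 64))   # semantic tokens (words of a class description)
vis_enriched = cross_attend(vis, sem)   # "forward": visual attends to semantic
sem_enriched = cross_attend(sem, vis)   # "backward": semantic attends to visual
```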
arXiv Detail & Related papers (2021-04-26T16:48:27Z)
- Learning Robust Visual-semantic Mapping for Zero-shot Learning [8.299945169799795]
We focus on fully empowering the semantic feature space, which is one of the key building blocks of zero-shot learning (ZSL).
In ZSL, the common practice is to train a mapping function between the visual and semantic feature spaces with labeled seen class examples.
Under such a paradigm, the ZSL models may easily suffer from the domain shift problem when constructing and reusing the mapping function.
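The mapping-function practice mentioned here is classically a ridge regression from visual features to semantic embeddings, followed by nearest-neighbor search at test time. A minimal numpy sketch of that common baseline (an assumption about the form; this paper's own mapping may differ):

```python
import numpy as np

def fit_mapping(X_vis, S_sem, lam=1.0):
    """Ridge regression from visual features to semantic embeddings, trained
    only on labeled seen-class examples (the common practice described above)."""
    d = X_vis.shape[1]
    return np.linalg.solve(X_vis.T @ X_vis + lam * np.eye(d), X_vis.T @ S_sem)

def predict_unseen(x, W, unseen_sem):
    """Project a test feature and pick the nearest unseen-class embedding;
    errors here are where the domain shift problem shows up."""
    return int(np.argmin(np.linalg.norm(unseen_sem - x @ W, axis=1)))
```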
arXiv Detail & Related papers (2021-04-12T17:39:38Z)
- Isometric Propagation Network for Generalized Zero-shot Learning [72.02404519815663]
A popular strategy is to learn a mapping between the semantic space of class attributes and the visual space of images based on the seen classes and their data.
We propose the Isometric Propagation Network (IPN), which learns to strengthen the relation between classes within each space and align the class dependency across the two spaces.
IPN achieves state-of-the-art performance on three popular zero-shot learning benchmarks.
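A hedged numpy sketch of the two ingredients named above, within-space propagation and cross-space dependency alignment (the specific propagation rule and MSE alignment are assumptions, not IPN's exact design):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def propagate(protos):
    """Within one space, let each class prototype absorb a similarity-weighted
    mixture of the others, strengthening class relations."""
    A = softmax(protos @ protos.T)
    return 0.5 * protos + 0.5 * (A @ protos)

def dependency_alignment_loss(vis_protos, sem_protos):
    """Penalize disagreement between the class-similarity structure of the
    visual space and that of the semantic space."""
    Av = softmax(vis_protos @ vis_protos.T)
    As = softmax(sem_protos @ sem_protos.T)
    return float(np.mean((Av - As) ** 2))
```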
arXiv Detail & Related papers (2021-02-03T12:45:38Z)
- Information Bottleneck Constrained Latent Bidirectional Embedding for Zero-Shot Learning [59.58381904522967]
We propose a novel embedding based generative model with a tight visual-semantic coupling constraint.
We learn a unified latent space that calibrates the embedded parametric distributions of both visual and semantic spaces.
Our method can be easily extended to the transductive ZSL setting by generating labels for unseen images.
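Calibrating parametric distributions across the two spaces can be pictured as tying per-class Gaussians together. A minimal numpy sketch under that assumption (the symmetric-KL coupling is illustrative, not the paper's exact constraint):

```python
import numpy as np

def gaussian_kl(mu1, var1, mu2, var2):
    """KL(N1 || N2) for diagonal Gaussians."""
    return 0.5 * np.sum(np.log(var2 / var1) + (var1 + (mu1 - mu2) ** 2) / var2 - 1.0)

def coupling_loss(mu_v, var_v, mu_s, var_s):
    """Symmetric KL tying a class's visual latent distribution to its semantic
    latent distribution, so the unified space stays calibrated."""
    return float(gaussian_kl(mu_v, var_v, mu_s, var_s) +
                 gaussian_kl(mu_s, var_s, mu_v, var_v))
```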
arXiv Detail & Related papers (2020-09-16T03:54:12Z)
- A Novel Perspective to Zero-shot Learning: Towards an Alignment of Manifold Structures via Semantic Feature Expansion [17.48923061278128]
A common practice in zero-shot learning is to train a projection between the visual and semantic feature spaces with labeled seen-class examples.
Under such a paradigm, most existing methods easily suffer from the domain shift problem, which weakens zero-shot recognition performance.
We propose a novel model called AMS-SFE that considers the alignment of manifold structures by semantic feature expansion.
arXiv Detail & Related papers (2020-04-30T14:08:10Z)
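Manifold-structure alignment can be approximated by matching pairwise-distance structure between the expanded semantic space and the visual space. A minimal numpy sketch of that reading (illustrative only; AMS-SFE's actual expansion and alignment are more involved):

```python
import numpy as np

def manifold_alignment_loss(sem_expanded, vis_protos):
    """Compare the pairwise-distance structure of the expanded semantic space
    with that of the visual space after scale normalization."""
    Dv = np.linalg.norm(vis_protos[:, None] - vis_protos[None, :], axis=-1)
    Ds = np.linalg.norm(sem_expanded[:, None] - sem_expanded[None, :], axis=-1)
    Dv, Ds = Dv / Dv.max(), Ds / Ds.max()   # assumes at least two distinct classes
    return float(np.mean((Dv - Ds) ** 2))
```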