Learning the Redundancy-free Features for Generalized Zero-Shot Object
Recognition
- URL: http://arxiv.org/abs/2006.08939v2
- Date: Sun, 23 May 2021 06:09:22 GMT
- Title: Learning the Redundancy-free Features for Generalized Zero-Shot Object
Recognition
- Authors: Zongyan Han, Zhenyong Fu and Jian Yang
- Abstract summary: Zero-shot object recognition aims to transfer the object recognition ability among semantically related categories.
In this paper, we learn the redundancy-free features for generalized zero-shot learning.
The results show that our redundancy-free feature based generalized zero-shot learning (RFF-GZSL) approach achieves competitive results compared with state-of-the-art methods.
- Score: 28.08885682748527
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Zero-shot object recognition, or zero-shot learning, aims to transfer
the object recognition ability among semantically related categories, such as
fine-grained animal or bird species. However, the images of different
fine-grained objects tend to exhibit only subtle differences in appearance,
which severely degrades zero-shot object recognition. To reduce the
superfluous information in the fine-grained objects, in this paper, we propose
to learn the redundancy-free features for generalized zero-shot learning. We
achieve this goal by projecting the original visual features into a new
(redundancy-free) feature space and then restricting the statistical dependence
between these two feature spaces. Furthermore, we require the projected
features to preserve and even strengthen the category relationships in the
redundancy-free feature space. In this way, we can remove the redundant
information from the visual features without losing the discriminative
information. We extensively evaluate our method on four benchmark datasets.
The results show that our redundancy-free feature based generalized zero-shot
learning (RFF-GZSL) approach achieves competitive results compared with
state-of-the-art methods.
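The core idea in the abstract, projecting visual features into a second, redundancy-free space while restricting the statistical dependence between the two spaces and keeping the class structure, can be illustrated with a short sketch. The snippet below is a hypothetical illustration only: it assumes an HSIC-style penalty as the dependence measure and a plain softmax classifier for the category constraint; the names (RedundancyFreeProjector, hsic, lambda_dep) and dimensions are made up, and the paper's actual estimator, architecture, and losses may differ.

```python
# Hypothetical sketch: project visual features into a redundancy-reduced space,
# penalize statistical dependence between the original and projected features
# (here via a biased HSIC estimate, an assumption), and keep class
# discrimination with a standard cross-entropy term.
import torch
import torch.nn as nn
import torch.nn.functional as F


def rbf_gram(x, sigma=1.0):
    """RBF-kernel Gram matrix for a batch of feature vectors."""
    d2 = torch.cdist(x, x).pow(2)
    return torch.exp(-d2 / (2 * sigma ** 2))


def hsic(x, z, sigma=1.0):
    """Biased HSIC estimate: a differentiable proxy for statistical dependence."""
    n = x.size(0)
    h = torch.eye(n, device=x.device) - 1.0 / n  # centering matrix H = I - (1/n) 11^T
    kx, kz = rbf_gram(x, sigma), rbf_gram(z, sigma)
    return torch.trace(kx @ h @ kz @ h) / (n - 1) ** 2


class RedundancyFreeProjector(nn.Module):
    """Maps raw visual features to a lower-dimensional, redundancy-reduced space."""

    def __init__(self, vis_dim=2048, out_dim=512, num_classes=150):
        super().__init__()
        self.project = nn.Sequential(
            nn.Linear(vis_dim, 1024), nn.ReLU(), nn.Linear(1024, out_dim)
        )
        self.classifier = nn.Linear(out_dim, num_classes)  # seen-class head

    def forward(self, vis_feat):
        z = self.project(vis_feat)
        return z, self.classifier(z)


def training_loss(model, vis_feat, labels, lambda_dep=0.1):
    """Cross-entropy keeps the category structure; the HSIC term limits how much
    of the original feature space the projection can depend on."""
    z, logits = model(vis_feat)
    return F.cross_entropy(logits, labels) + lambda_dep * hsic(vis_feat, z)
```

In such a setup, the projected features z would then stand in for the raw visual features in the downstream generalized zero-shot classifier.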
Related papers
- High-Discriminative Attribute Feature Learning for Generalized Zero-Shot Learning [54.86882315023791]
We propose an innovative approach called High-Discriminative Attribute Feature Learning for Generalized Zero-Shot Learning (HDAFL).
HDAFL utilizes multiple convolutional kernels to automatically learn discriminative regions highly correlated with attributes in images.
We also introduce a Transformer-based attribute discrimination encoder to enhance the discriminative capability among attributes.
arXiv Detail & Related papers (2024-04-07T13:17:47Z)
- Exploiting Semantic Attributes for Transductive Zero-Shot Learning [97.61371730534258]
Zero-shot learning aims to recognize unseen classes by generalizing the relation between visual features and semantic attributes learned from the seen classes.
We present a novel transductive ZSL method that produces semantic attributes of the unseen data and imposes them on the generative process.
Experiments on five standard benchmarks show that our method yields state-of-the-art results for zero-shot learning.
arXiv Detail & Related papers (2023-03-17T09:09:48Z)
- Self-Supervised Visual Representation Learning with Semantic Grouping [50.14703605659837]
We tackle the problem of learning visual representations from unlabeled scene-centric data.
We propose contrastive learning from data-driven semantic slots, namely SlotCon, for joint semantic grouping and representation learning.
arXiv Detail & Related papers (2022-05-30T17:50:59Z)
- Cross-modal Representation Learning for Zero-shot Action Recognition [67.57406812235767]
We present a cross-modal Transformer-based framework, which jointly encodes video data and text labels for zero-shot action recognition (ZSAR).
Our model employs a conceptually new pipeline by which visual representations are learned in conjunction with visual-semantic associations in an end-to-end manner.
Experimental results show our model considerably improves upon the state of the art in ZSAR, reaching encouraging top-1 accuracy on the UCF101, HMDB51, and ActivityNet benchmark datasets.
arXiv Detail & Related papers (2022-05-03T17:39:27Z)
- VGSE: Visually-Grounded Semantic Embeddings for Zero-Shot Learning [113.50220968583353]
We propose to discover semantic embeddings containing discriminative visual properties for zero-shot learning.
Our model visually divides a set of images from seen classes into clusters of local image regions according to their visual similarity.
We demonstrate that our visually-grounded semantic embeddings further improve performance over word embeddings across various ZSL models by a large margin.
arXiv Detail & Related papers (2022-03-20T03:49:02Z)
- Towards Self-Supervised Learning of Global and Object-Centric Representations [4.36572039512405]
We discuss key aspects of learning structured object-centric representations with self-supervision.
We validate our insights through several experiments on the CLEVR dataset.
arXiv Detail & Related papers (2022-03-11T15:18:47Z)
- Intriguing Properties of Contrastive Losses [12.953112189125411]
We study three intriguing properties of contrastive learning.
We study if instance-based contrastive learning can learn well on images with multiple objects present.
We show that, for contrastive learning, a few bits of easy-to-learn shared features can suppress, and even fully prevent, the learning of other sets of competing features.
arXiv Detail & Related papers (2020-11-05T13:19:48Z)
- Synthesizing the Unseen for Zero-shot Object Detection [72.38031440014463]
We propose to synthesize visual features for unseen classes, so that the model learns both seen and unseen objects in the visual domain.
We use a novel generative model that uses class-semantics to not only generate the features but also to discriminatively separate them.
arXiv Detail & Related papers (2020-10-19T12:36:11Z)
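The last related paper above describes synthesizing visual features for unseen classes from class semantics. A minimal, hypothetical sketch of that general idea follows; it is not the discriminative generative model described in that paper, and all names and dimensions are assumptions.

```python
# Hypothetical sketch of semantics-conditioned feature synthesis: a generator
# maps a class-semantic vector plus noise to a visual feature, and synthetic
# features for unseen classes are sampled to train a final classifier.
import torch
import torch.nn as nn


class SemanticFeatureGenerator(nn.Module):
    """Generates visual features from a class-semantic vector and random noise."""

    def __init__(self, sem_dim=312, noise_dim=128, vis_dim=2048):
        super().__init__()
        self.noise_dim = noise_dim
        self.net = nn.Sequential(
            nn.Linear(sem_dim + noise_dim, 1024), nn.LeakyReLU(0.2),
            nn.Linear(1024, vis_dim), nn.ReLU(),
        )

    def forward(self, semantics, noise):
        return self.net(torch.cat([semantics, noise], dim=1))


def synthesize_unseen(generator, unseen_semantics, per_class=50):
    """Draws synthetic features per unseen class, which can then be used to
    train a classifier over both seen and unseen categories."""
    feats, labels = [], []
    for cls_idx, sem in enumerate(unseen_semantics):  # unseen_semantics: (num_unseen, sem_dim)
        noise = torch.randn(per_class, generator.noise_dim)
        sem_batch = sem.unsqueeze(0).expand(per_class, -1)
        feats.append(generator(sem_batch, noise))
        labels.append(torch.full((per_class,), cls_idx, dtype=torch.long))
    return torch.cat(feats), torch.cat(labels)
```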