Resolving Semantic Confusions for Improved Zero-Shot Detection
- URL: http://arxiv.org/abs/2212.06097v1
- Date: Mon, 12 Dec 2022 18:11:48 GMT
- Title: Resolving Semantic Confusions for Improved Zero-Shot Detection
- Authors: Sandipan Sarma, Sushil Kumar, Arijit Sur
- Abstract summary: We propose a generative model incorporating a triplet loss that acknowledges the degree of dissimilarity between classes.
A cyclic-consistency loss is also enforced to ensure that generated visual samples of a class highly correspond to their own semantics.
- Score: 6.72910827751713
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Zero-shot detection (ZSD) is a challenging task where we aim to recognize and
localize objects simultaneously, even when our model has not been trained with
visual samples of a few target ("unseen") classes. Recently, methods employing
generative models like GANs have shown some of the best results, where
unseen-class samples are generated based on their semantics by a GAN trained on
seen-class data, enabling vanilla object detectors to recognize unseen objects.
However, the problem of semantic confusion still remains, where the model is
sometimes unable to distinguish between semantically-similar classes. In this
work, we propose to train a generative model incorporating a triplet loss that
acknowledges the degree of dissimilarity between classes and reflects them in
the generated samples. Moreover, a cyclic-consistency loss is also enforced to
ensure that generated visual samples of a class highly correspond to their own
semantics. Extensive experiments on two benchmark ZSD datasets - MSCOCO and
PASCAL-VOC - demonstrate significant gains over the current ZSD methods,
reducing semantic confusion and improving detection for the unseen classes.
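To make the two losses concrete, here is a minimal PyTorch sketch of the idea the abstract describes: a conditional generator produces class features, a triplet loss pushes them away from features of a semantically similar (confusing) class, and a decoder maps generated features back to their semantics for cyclic consistency. The module names, layer sizes, and margin are illustrative assumptions, not the authors' released implementation.

```python
# Hypothetical sketch (not the authors' code): a conditional feature generator
# trained with a triplet loss over confusing classes and a cyclic-consistency
# loss that maps generated features back to their class semantics.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureGenerator(nn.Module):
    """Generates visual features from a class-semantic vector plus noise."""
    def __init__(self, sem_dim=300, noise_dim=128, feat_dim=1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(sem_dim + noise_dim, 2048), nn.LeakyReLU(0.2),
            nn.Linear(2048, feat_dim), nn.ReLU())

    def forward(self, sem, noise):
        return self.net(torch.cat([sem, noise], dim=1))

class SemanticDecoder(nn.Module):
    """Maps a visual feature back to its semantics (cycle consistency)."""
    def __init__(self, feat_dim=1024, sem_dim=300):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim, 512), nn.LeakyReLU(0.2),
                                 nn.Linear(512, sem_dim))

    def forward(self, feat):
        return self.net(feat)

def triplet_loss(anchor, positive, negative, margin=1.0):
    # Pull generated features toward real features of the same class and push
    # them away from features of a semantically similar (confusing) class.
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()

# One training step, shapes only; real training would also include GAN losses.
G, Dec = FeatureGenerator(), SemanticDecoder()
sem = torch.randn(8, 300)        # class semantics (e.g., word embeddings)
real_pos = torch.randn(8, 1024)  # real features of the same class
real_neg = torch.randn(8, 1024)  # real features of a similar, confusing class
fake = G(sem, torch.randn(8, 128))
loss = triplet_loss(fake, real_pos, real_neg) + F.mse_loss(Dec(fake), sem)
loss.backward()
```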
Related papers
- Zero-Shot Temporal Action Detection via Vision-Language Prompting [134.26292288193298]
We propose a novel zero-shot temporal action detection model via vision-language prompting (STALE).
Our model significantly outperforms state-of-the-art alternatives.
Our model also yields superior results on supervised TAD over recent strong competitors.
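The summary above gives no architectural detail; as a very loose illustration of vision-language prompting for zero-shot temporal action detection, per-frame features can be scored against text-prompt embeddings and thresholded into candidate segments. The feature dimensions, prompt source, and threshold below are all assumptions.

```python
# Rough illustration only: score frame features against class-prompt embeddings.
import torch
import torch.nn.functional as F

frame_feats = torch.randn(200, 512)    # per-frame video features (assumed)
prompt_embeds = torch.randn(10, 512)   # text embeddings of "a video of {action}" prompts
scores = F.normalize(frame_feats, dim=1) @ F.normalize(prompt_embeds, dim=1).T
conf, cls = scores.max(dim=1)          # best class per frame
active = conf > 0.2                    # frames likely containing an action (assumed threshold)
# Contiguous runs of `active` frames with the same class form candidate action segments.
```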
arXiv Detail & Related papers (2022-07-17T13:59:46Z)
- GSMFlow: Generation Shifts Mitigating Flow for Generalized Zero-Shot Learning [55.79997930181418]
Generalized Zero-Shot Learning aims to recognize images from both the seen and unseen classes by transferring semantic knowledge from seen to unseen classes.
A promising solution is to take advantage of generative models to hallucinate realistic unseen samples based on the knowledge learned from the seen classes.
We propose a novel flow-based generative framework that consists of multiple conditional affine coupling layers for learning unseen data generation.
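As a hedged sketch of the building block mentioned above, a conditional affine coupling layer scales and shifts one half of a feature vector using parameters predicted from the other half and the class semantics; because the transform is exactly invertible, unseen-class features can be sampled by inverting the flow. Layer sizes and dimensions here are assumptions, not the paper's configuration.

```python
# Assumed sketch of a conditional affine coupling layer (flow building block).
import torch
import torch.nn as nn

class ConditionalAffineCoupling(nn.Module):
    def __init__(self, feat_dim=1024, sem_dim=300, hidden=512):
        super().__init__()
        self.half = feat_dim // 2
        # Scale/shift for one half are predicted from the other half + semantics.
        self.net = nn.Sequential(
            nn.Linear(self.half + sem_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * self.half))

    def forward(self, x, sem):
        x1, x2 = x[:, :self.half], x[:, self.half:]
        log_s, t = self.net(torch.cat([x1, sem], dim=1)).chunk(2, dim=1)
        log_s = torch.tanh(log_s)              # keep scales bounded
        y2 = x2 * torch.exp(log_s) + t         # invertible affine transform
        log_det = log_s.sum(dim=1)             # exact log-determinant for the flow loss
        return torch.cat([x1, y2], dim=1), log_det

    def inverse(self, y, sem):
        y1, y2 = y[:, :self.half], y[:, self.half:]
        log_s, t = self.net(torch.cat([y1, sem], dim=1)).chunk(2, dim=1)
        log_s = torch.tanh(log_s)
        x2 = (y2 - t) * torch.exp(-log_s)
        return torch.cat([y1, x2], dim=1)

# Sampling unseen-class features: draw Gaussian noise and invert the flow
# conditioned on the unseen class's semantic vector.
layer = ConditionalAffineCoupling()
z = torch.randn(4, 1024)
unseen_sem = torch.randn(4, 300)
fake_features = layer.inverse(z, unseen_sem)
```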
arXiv Detail & Related papers (2022-07-05T04:04:37Z)
- DUET: Cross-modal Semantic Grounding for Contrastive Zero-shot Learning [37.48292304239107]
We present a transformer-based end-to-end ZSL method named DUET.
We develop a cross-modal semantic grounding network to investigate the model's capability of disentangling semantic attributes from the images.
We find that DUET often achieves state-of-the-art performance, that its components are effective, and that its predictions are interpretable.
arXiv Detail & Related papers (2022-07-04T11:12:12Z)
- Bias-Eliminated Semantic Refinement for Any-Shot Learning [27.374052527155623]
We refine the coarse-grained semantic description for any-shot learning tasks.
A new model, the semantic refinement Wasserstein generative adversarial network (SRWGAN), is designed.
We extensively evaluate model performance on six benchmark datasets.
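For illustration only (not the SRWGAN code), a semantic-refinement module paired with a conditional WGAN-GP critic could be sketched as below; all dimensions and the penalty weight are assumptions.

```python
# Assumed sketch: refine coarse class semantics with a small MLP, then train a
# conditional WGAN-GP critic on features paired with the refined semantics.
import torch
import torch.nn as nn

refine = nn.Sequential(nn.Linear(300, 300), nn.ReLU(), nn.Linear(300, 300))
critic = nn.Sequential(nn.Linear(1024 + 300, 512), nn.LeakyReLU(0.2), nn.Linear(512, 1))

def critic_loss(real_feat, fake_feat, sem, lam=10.0):
    sem_r = refine(sem)                                   # refined semantics
    d_real = critic(torch.cat([real_feat, sem_r], dim=1))
    d_fake = critic(torch.cat([fake_feat, sem_r], dim=1))
    # WGAN-GP gradient penalty on interpolated features.
    eps = torch.rand(real_feat.size(0), 1)
    inter = (eps * real_feat + (1 - eps) * fake_feat).requires_grad_(True)
    d_inter = critic(torch.cat([inter, sem_r], dim=1))
    grad = torch.autograd.grad(d_inter.sum(), inter, create_graph=True)[0]
    gp = ((grad.norm(2, dim=1) - 1) ** 2).mean()
    return d_fake.mean() - d_real.mean() + lam * gp

loss = critic_loss(torch.randn(8, 1024), torch.randn(8, 1024), torch.randn(8, 300))
loss.backward()
```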
arXiv Detail & Related papers (2022-02-10T04:15:50Z) - Semantics-Guided Contrastive Network for Zero-Shot Object detection [67.61512036994458]
Zero-shot object detection (ZSD) is a new challenge in computer vision.
We develop ContrastZSD, a framework that brings the contrastive learning mechanism into the realm of zero-shot detection.
Our method outperforms the previous state-of-the-art on both ZSD and generalized ZSD tasks.
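As an assumed sketch of a semantics-guided contrastive objective in this spirit, region features can be contrasted against class semantic embeddings with an InfoNCE-style loss; the dimensions and temperature below are placeholders.

```python
# Assumed sketch: region features are pulled toward the embedding of their
# ground-truth class and pushed away from the other class embeddings.
import torch
import torch.nn.functional as F

def region_class_contrastive(region_feats, class_embeds, labels, tau=0.1):
    # region_feats: (N, D) RoI features projected into the semantic space
    # class_embeds: (C, D) one embedding per (seen) class
    # labels:       (N,)  ground-truth class index per region
    sims = F.normalize(region_feats, dim=1) @ F.normalize(class_embeds, dim=1).T
    return F.cross_entropy(sims / tau, labels)   # InfoNCE over classes

loss = region_class_contrastive(torch.randn(16, 300), torch.randn(20, 300),
                                torch.randint(0, 20, (16,)))
```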
arXiv Detail & Related papers (2021-09-04T03:32:15Z) - Neighborhood Contrastive Learning for Novel Class Discovery [79.14767688903028]
We build a new framework, named Neighborhood Contrastive Learning, to learn discriminative representations that are important to clustering performance.
We experimentally demonstrate that these two ingredients significantly contribute to clustering performance and lead our model to outperform state-of-the-art methods by a large margin.
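One way such a neighborhood contrastive loss can be sketched (details assumed, not the paper's exact formulation) is to treat each embedding's k nearest neighbors within the batch as pseudo-positives:

```python
# Assumed sketch of a neighborhood contrastive objective over batch embeddings.
import torch
import torch.nn.functional as F

def neighborhood_contrastive(z, k=3, tau=0.1):
    z = F.normalize(z, dim=1)
    sim = z @ z.T                               # (N, N) cosine similarities
    sim.fill_diagonal_(-float('inf'))           # never pick yourself as a neighbor
    logits = sim / tau
    nn_idx = sim.topk(k, dim=1).indices         # k nearest neighbors = pseudo-positives
    log_prob = logits - logits.logsumexp(dim=1, keepdim=True)
    return -log_prob.gather(1, nn_idx).mean()   # maximize probability of neighbors

loss = neighborhood_contrastive(torch.randn(32, 128))
```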
arXiv Detail & Related papers (2021-06-20T17:34:55Z) - Incrementally Zero-Shot Detection by an Extreme Value Analyzer [0.0]
This paper introduces a novel strategy for both zero-shot learning and class-incremental learning in real-world object detection.
We propose a novel extreme value analyzer to detect objects from old seen, new seen, and unseen classes, simultaneously.
Experiments demonstrate the efficacy of our model in detecting objects from both the seen and unseen classes, outperforming alternative models on the PASCAL VOC and MSCOCO datasets.
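The paper's exact analyzer is not described in this summary; as a simplified extreme-value sketch, one can fit a Weibull distribution to the largest distances from a class mean and flag detections falling in that tail as potentially unseen. The tail size, feature dimension, and threshold below are assumptions.

```python
# Simplified extreme-value-theory sketch for rejecting unseen-class detections.
import numpy as np
from scipy.stats import weibull_min

rng = np.random.default_rng(0)
class_feats = rng.normal(size=(500, 128))          # training features of one seen class
mean = class_feats.mean(axis=0)
dists = np.linalg.norm(class_feats - mean, axis=1)
tail = np.sort(dists)[-50:]                        # most extreme distances to the class mean
shape, loc, scale = weibull_min.fit(tail, floc=0)  # EVT fit on the tail

def unseen_probability(feat):
    d = np.linalg.norm(feat - mean)
    return weibull_min.cdf(d, shape, loc=loc, scale=scale)  # near 1 => far outside the class

print(unseen_probability(rng.normal(size=128) * 3.0))
```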
arXiv Detail & Related papers (2021-03-23T15:06:30Z) - Entropy-Based Uncertainty Calibration for Generalized Zero-Shot Learning [49.04790688256481]
The goal of generalized zero-shot learning (GZSL) is to recognise both seen and unseen classes.
Most GZSL methods learn to synthesise visual representations from semantic information about the unseen classes.
We propose a novel framework that leverages dual variational autoencoders with a triplet loss to learn discriminative latent features.
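A minimal sketch, assuming aligned visual and semantic VAEs (not the paper's exact architecture), shows how a triplet loss in the shared latent space can keep the learned codes discriminative; all dimensions are placeholders.

```python
# Assumed sketch: dual VAE encoders share a latent space; a triplet loss keeps
# latent codes of mismatched classes apart.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    def __init__(self, in_dim, latent=64):
        super().__init__()
        self.body = nn.Linear(in_dim, 256)
        self.mu, self.logvar = nn.Linear(256, latent), nn.Linear(256, latent)

    def forward(self, x):
        h = F.relu(self.body(x))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        return z, mu, logvar

vis_enc, sem_enc = Encoder(2048), Encoder(300)
z_v, mu_v, lv_v = vis_enc(torch.randn(8, 2048))   # visual features
z_s, _, _ = sem_enc(torch.randn(8, 300))          # matching class semantics
z_neg, _, _ = sem_enc(torch.randn(8, 300))        # semantics of other classes

kl = -0.5 * torch.mean(1 + lv_v - mu_v.pow(2) - lv_v.exp())
align = F.mse_loss(z_v, z_s)                      # cross-modal latent alignment
triplet = F.triplet_margin_loss(z_v, z_s, z_neg, margin=1.0)
(kl + align + triplet).backward()
```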
arXiv Detail & Related papers (2021-01-09T05:21:27Z) - Synthesizing the Unseen for Zero-shot Object Detection [72.38031440014463]
We propose to synthesize visual features for unseen classes, so that the model learns both seen and unseen objects in the visual domain.
We use a novel generative model that leverages class semantics not only to generate the features but also to discriminatively separate them.
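As a loose illustration (not the paper's implementation), synthesizing features from class semantics while keeping them discriminatively separated can be sketched with a conditional generator plus a classification loss on the generated features; sizes and losses below are assumptions.

```python
# Assumed sketch: generate per-class features from semantics and train them to
# remain separable with a classification loss.
import torch
import torch.nn as nn
import torch.nn.functional as F

num_classes, sem_dim, feat_dim = 20, 300, 1024
gen = nn.Sequential(nn.Linear(sem_dim + 128, 1024), nn.LeakyReLU(0.2),
                    nn.Linear(1024, feat_dim))
clf = nn.Linear(feat_dim, num_classes)

sem = torch.randn(num_classes, sem_dim)        # one semantic vector per class
labels = torch.arange(num_classes)
fake = gen(torch.cat([sem, torch.randn(num_classes, 128)], dim=1))
loss = F.cross_entropy(clf(fake), labels)      # discriminative separation of synthesized classes
loss.backward()
# The synthesized features can then help train a detector head that also covers unseen classes.
```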
arXiv Detail & Related papers (2020-10-19T12:36:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.