Meta-ZSDETR: Zero-shot DETR with Meta-learning
- URL: http://arxiv.org/abs/2308.09540v1
- Date: Fri, 18 Aug 2023 13:17:07 GMT
- Title: Meta-ZSDETR: Zero-shot DETR with Meta-learning
- Authors: Lu Zhang, Chenbo Zhang, Jiajia Zhao, Jihong Guan, Shuigeng Zhou
- Abstract summary: We present the first method that combines DETR and meta-learning to perform zero-shot object detection, named Meta-ZSDETR.
The model is optimized with meta-contrastive learning, which contains a regression head to generate the coordinates of class-specific boxes.
Experimental results show that our method outperforms the existing ZSD methods by a large margin.
- Score: 29.58827207505671
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Zero-shot object detection aims to localize and recognize objects of unseen
classes. Most existing works face two problems: the low recall of the RPN on
unseen classes and the confusion of unseen classes with background. In this
paper, we present the first method that combines DETR and meta-learning to
perform zero-shot object detection, named Meta-ZSDETR, where model training is
formalized as an individual episode-based meta-learning task. Unlike
Faster R-CNN based methods, which first generate class-agnostic proposals and
then classify them with a visual-semantic alignment module, Meta-ZSDETR directly
predicts class-specific boxes with class-specific queries and further filters
them with the predicted accuracy from the classification head. The model is
optimized with meta-contrastive learning, which contains a regression head to
generate the coordinates of class-specific boxes, a classification head to
predict the accuracy of generated boxes, and a contrastive head that utilizes
the proposed contrastive-reconstruction loss to further separate different
classes in visual space. We conduct extensive experiments on two benchmark
datasets, MS COCO and PASCAL VOC. Experimental results show that our method
outperforms the existing ZSD methods by a large margin.
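The filtering step described in the abstract (class-specific boxes kept or dropped according to the accuracy predicted by the classification head) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the `Prediction` record, the `filter_predictions` helper, the normalized `(cx, cy, w, h)` box layout, and the 0.5 threshold.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Prediction:
    """Hypothetical output of one class-specific query (illustrative only)."""
    class_name: str
    box: Tuple[float, float, float, float]  # assumed (cx, cy, w, h), normalized
    accuracy: float  # assumed score from the classification head, in [0, 1]


def filter_predictions(preds: List[Prediction],
                       threshold: float = 0.5) -> List[Prediction]:
    """Keep boxes whose predicted accuracy reaches the threshold,
    ordered from most to least confident."""
    kept = [p for p in preds if p.accuracy >= threshold]
    return sorted(kept, key=lambda p: p.accuracy, reverse=True)


# Toy predictions; the values are made up for the example.
example = [
    Prediction("zebra", (0.5, 0.5, 0.2, 0.3), 0.91),
    Prediction("zebra", (0.1, 0.2, 0.1, 0.1), 0.12),
    Prediction("cat", (0.7, 0.4, 0.3, 0.3), 0.64),
]
kept = filter_predictions(example, threshold=0.5)
```

In this sketch each box already carries its class label because it came from a class-specific query, so the score only decides whether the box is kept, not which class it belongs to.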
Related papers
- Learning Classifiers of Prototypes and Reciprocal Points for Universal Domain Adaptation [79.62038105814658]
Universal Domain Adaptation aims to transfer knowledge between datasets by handling two shifts: domain shift and category shift.
The main challenge is correctly distinguishing the unknown target samples while adapting the distribution of known-class knowledge from source to target.
Most existing methods approach this problem by first training the target-adapted known classifier and then relying on a single threshold to distinguish unknown target samples.
arXiv Detail & Related papers (2022-12-16T09:01:57Z)
- Meta-DETR: Image-Level Few-Shot Detection with Inter-Class Correlation Exploitation [100.87407396364137]
We design Meta-DETR, which (i) is the first image-level few-shot detector, and (ii) introduces a novel inter-class correlational meta-learning strategy.
Experiments over multiple few-shot object detection benchmarks show that the proposed Meta-DETR outperforms state-of-the-art methods by large margins.
arXiv Detail & Related papers (2022-07-30T13:46:07Z)
- Learning What Not to Segment: A New Perspective on Few-Shot Segmentation [63.910211095033596]
Recently few-shot segmentation (FSS) has been extensively developed.
This paper proposes a fresh and straightforward insight to alleviate the problem.
In light of the unique nature of the proposed approach, we also extend it to a more realistic but challenging setting.
arXiv Detail & Related papers (2022-03-15T03:08:27Z)
- CAR: Class-aware Regularizations for Semantic Segmentation [20.947897583427192]
We propose a universal Class-Aware Regularization (CAR) approach to optimize the intra-class variance and inter-class distance during feature learning.
Our method can be easily applied to most existing segmentation models during training, including OCR and CPNet.
arXiv Detail & Related papers (2022-03-14T15:02:48Z)
- A Gating Model for Bias Calibration in Generalized Zero-shot Learning [18.32369721322249]
Generalized zero-shot learning (GZSL) aims at training a model that can generalize to unseen class data by only using auxiliary information.
One of the main challenges in GZSL is a biased model prediction toward seen classes caused by overfitting on only available seen class data during training.
We propose a two-stream autoencoder-based gating model for GZSL.
arXiv Detail & Related papers (2022-03-08T16:41:06Z)
- Rank4Class: A Ranking Formulation for Multiclass Classification [26.47229268790206]
Multiclass classification (MCC) is a fundamental machine learning problem.
We show that it is easy to boost MCC performance with a novel formulation through the lens of ranking.
arXiv Detail & Related papers (2021-12-17T19:22:37Z)
- Self-Supervised Class Incremental Learning [51.62542103481908]
Existing Class Incremental Learning (CIL) methods are based on a supervised classification framework sensitive to data labels.
When updating them based on the new class data, they suffer from catastrophic forgetting: the model cannot discern old class data clearly from the new.
In this paper, we explore the performance of Self-Supervised representation learning in Class Incremental Learning (SSCIL) for the first time.
arXiv Detail & Related papers (2021-11-18T06:58:19Z)
- No Fear of Heterogeneity: Classifier Calibration for Federated Learning with Non-IID Data [78.69828864672978]
A central challenge in training classification models in a real-world federated system is learning with non-IID data.
We propose a novel and simple algorithm called Classifier Calibration with Virtual Representations (CCVR), which adjusts the classifier using virtual representations sampled from an approximated Gaussian mixture model.
Experimental results demonstrate that CCVR achieves state-of-the-art performance on popular federated learning benchmarks including CIFAR-10, CIFAR-100, and CINIC-10.
arXiv Detail & Related papers (2021-06-09T12:02:29Z)
- Meta Faster R-CNN: Towards Accurate Few-Shot Object Detection with Attentive Feature Alignment [33.446875089255876]
Few-shot object detection (FSOD) aims to detect objects using only a few examples.
We propose a meta-learning based few-shot object detection method by transferring meta-knowledge learned from data-abundant base classes to data-scarce novel classes.
arXiv Detail & Related papers (2021-04-15T19:01:27Z)
- Meta-DETR: Few-Shot Object Detection via Unified Image-Level Meta-Learning [39.50529982746885]
Few-shot object detection aims at detecting novel objects with only a few annotated examples.
This paper presents a novel meta-detector framework, namely Meta-DETR, which eliminates region-wise prediction.
It instead meta-learns object localization and classification at image level in a unified and complementary manner.
arXiv Detail & Related papers (2021-03-22T11:14:00Z)
- Few-shot Action Recognition with Prototype-centered Attentive Learning [88.10852114988829]
We propose a Prototype-centered Attentive Learning (PAL) model composed of two novel components.
First, a prototype-centered contrastive learning loss is introduced to complement the conventional query-centered learning objective.
Second, PAL integrates an attentive hybrid learning mechanism that can minimize the negative impacts of outliers.
arXiv Detail & Related papers (2021-01-20T11:48:12Z)