ProxyDet: Synthesizing Proxy Novel Classes via Classwise Mixup for
Open-Vocabulary Object Detection
- URL: http://arxiv.org/abs/2312.07266v4
- Date: Tue, 20 Feb 2024 22:57:58 GMT
- Title: ProxyDet: Synthesizing Proxy Novel Classes via Classwise Mixup for
Open-Vocabulary Object Detection
- Authors: Joonhyun Jeong, Geondo Park, Jayeon Yoo, Hyungsik Jung, Heesu Kim
- Abstract summary: Open-vocabulary object detection (OVOD) aims to recognize novel objects whose categories are not included in the training set.
We present a novel, yet simple technique that helps generalization on the overall distribution of novel classes.
- Score: 7.122652901894367
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Open-vocabulary object detection (OVOD) aims to recognize novel objects whose
categories are not included in the training set. In order to classify these
unseen classes during training, many OVOD frameworks leverage the zero-shot
capability of largely pretrained vision and language models, such as CLIP. To
further improve generalization on the unseen novel classes, several approaches
proposed to additionally train with pseudo region labeling on the external data
sources that contain a substantial number of novel category labels beyond the
existing training data. Albeit its simplicity, these pseudo-labeling methods
still exhibit limited improvement with regard to the truly unseen novel classes
that were not pseudo-labeled. In this paper, we present a novel, yet simple
technique that helps generalization on the overall distribution of novel
classes. Inspired by our observation that numerous novel classes reside within
the convex hull constructed by the base (seen) classes in the CLIP embedding
space, we propose to synthesize proxy-novel classes approximating novel classes
via linear mixup between a pair of base classes. By training our detector with
these synthetic proxy-novel classes, we effectively explore the embedding space
of novel classes. The experimental results on various OVOD benchmarks such as
LVIS and COCO demonstrate superior performance on novel classes compared to the
other state-of-the-art methods. Code is available at
https://github.com/clovaai/ProxyDet.
Related papers
- Semantic Enhanced Few-shot Object Detection [37.715912401900745]
We propose a fine-tuning based FSOD framework that utilizes semantic embeddings for better detection.
Our method allows each novel class to construct a compact feature space without being confused with similar base classes.
arXiv Detail & Related papers (2024-06-19T12:40:55Z) - Organizing Background to Explore Latent Classes for Incremental Few-shot Semantic Segmentation [7.570798966278471]
incremental Few-shot Semantic COCO (iFSS) is to extend pre-trained segmentation models to new classes via few annotated images.
We propose a network called OINet, i.e., the background embedding space textbfOrganization and prototype textbfInherit Network.
arXiv Detail & Related papers (2024-05-29T23:22:12Z) - Open-Vocabulary Object Detection with Meta Prompt Representation and Instance Contrastive Optimization [63.66349334291372]
We propose a framework with Meta prompt and Instance Contrastive learning (MIC) schemes.
Firstly, we simulate a novel-class-emerging scenario to help the prompt that learns class and background prompts generalize to novel classes.
Secondly, we design an instance-level contrastive strategy to promote intra-class compactness and inter-class separation, which benefits generalization of the detector to novel class objects.
arXiv Detail & Related papers (2024-03-14T14:25:10Z) - Few-Shot Class-Incremental Learning via Training-Free Prototype
Calibration [67.69532794049445]
We find a tendency for existing methods to misclassify the samples of new classes into base classes, which leads to the poor performance of new classes.
We propose a simple yet effective Training-frEE calibratioN (TEEN) strategy to enhance the discriminability of new classes.
arXiv Detail & Related papers (2023-12-08T18:24:08Z) - Class-incremental Novel Class Discovery [76.35226130521758]
We study the new task of class-incremental Novel Class Discovery (class-iNCD)
We propose a novel approach for class-iNCD which prevents forgetting of past information about the base classes.
Our experiments, conducted on three common benchmarks, demonstrate that our method significantly outperforms state-of-the-art approaches.
arXiv Detail & Related papers (2022-07-18T13:49:27Z) - Demystifying the Base and Novel Performances for Few-shot
Class-incremental Learning [15.762281194023462]
Few-shot class-incremental learning (FSCIL) has addressed challenging real-world scenarios where unseen novel classes continually arrive with few samples.
It is required to develop a model that recognizes the novel classes without forgetting prior knowledge.
It is shown that our straightforward method has comparable performance with the sophisticated state-of-the-art algorithms.
arXiv Detail & Related papers (2022-06-18T00:39:47Z) - Few-Shot Object Detection via Association and DIscrimination [83.8472428718097]
Few-shot object detection via Association and DIscrimination builds up a discriminative feature space for each novel class with two integral steps.
Experiments on Pascal VOC and MS-COCO datasets demonstrate FADI achieves new SOTA performance, significantly improving the baseline in any shot/split by +18.7.
arXiv Detail & Related papers (2021-11-23T05:04:06Z) - Bridging Non Co-occurrence with Unlabeled In-the-wild Data for
Incremental Object Detection [56.22467011292147]
Several incremental learning methods are proposed to mitigate catastrophic forgetting for object detection.
Despite the effectiveness, these methods require co-occurrence of the unlabeled base classes in the training data of the novel classes.
We propose the use of unlabeled in-the-wild data to bridge the non-occurrence caused by the missing base classes during the training of additional novel classes.
arXiv Detail & Related papers (2021-10-28T10:57:25Z) - UniT: Unified Knowledge Transfer for Any-shot Object Detection and
Segmentation [52.487469544343305]
Methods for object detection and segmentation rely on large scale instance-level annotations for training.
We propose an intuitive and unified semi-supervised model that is applicable to a range of supervision.
arXiv Detail & Related papers (2020-06-12T22:45:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.