Few-Shot Object Detection with Sparse Context Transformers
- URL: http://arxiv.org/abs/2402.09315v1
- Date: Wed, 14 Feb 2024 17:10:01 GMT
- Title: Few-Shot Object Detection with Sparse Context Transformers
- Authors: Jie Mei, Mingyuan Jiu, Hichem Sahbi, Xiaoheng Jiang, Mingliang Xu
- Abstract summary: Few-shot detection is a major task in pattern recognition which seeks to localize objects using models trained with only a few labeled samples.
We propose a novel sparse context transformer (SCT) that effectively leverages object knowledge in the source domain and automatically learns a sparse context from only a few training images in the target domain.
We evaluate the proposed method on two challenging few-shot object detection benchmarks, and empirical results show that the proposed method obtains competitive performance compared to the related state-of-the-art.
- Score: 37.106378859592965
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot detection is a major task in pattern recognition which seeks to
localize objects using models trained with only a few labeled samples. One of the
mainstream few-shot approaches is transfer learning, which consists of pretraining
a detection model in a source domain prior to fine-tuning it in a target
domain. However, it is challenging for fine-tuned models to effectively
identify new classes in the target domain, particularly when the underlying
labeled training data are scarce. In this paper, we devise a novel sparse
context transformer (SCT) that effectively leverages object knowledge in the
source domain and automatically learns a sparse context from only a few training
images in the target domain. As a result, it combines different relevant clues
in order to enhance the discriminative power of the learned detectors and
reduce class confusion. We evaluate the proposed method on two challenging
few-shot object detection benchmarks, and empirical results show that it
achieves competitive performance compared to the related state of the art.
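To make the idea concrete, below is a minimal, hypothetical PyTorch sketch of a sparse context block: region features attend to a small learnable context bank standing in for source-domain object knowledge, and the attention is sparsified by keeping only the top-k weights so each region relies on a few contextual clues. The class name, the context-bank parameterization, and the top-k sparsification are illustrative assumptions and are not taken from the paper's released implementation.

```python
# Hypothetical sketch only: names and the top-k sparsification scheme are
# assumptions for illustration, not the authors' published code.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SparseContextBlock(nn.Module):
    """Attend from region features to a small bank of context vectors,
    keeping only the top-k attention weights per region so that each
    region is explained by a sparse set of contextual clues."""

    def __init__(self, feat_dim: int, num_context: int = 32, top_k: int = 4):
        super().__init__()
        self.query = nn.Linear(feat_dim, feat_dim)
        self.key = nn.Linear(feat_dim, feat_dim)
        self.value = nn.Linear(feat_dim, feat_dim)
        # Learnable context bank standing in for source-domain object knowledge.
        self.context_bank = nn.Parameter(torch.randn(num_context, feat_dim))
        self.top_k = top_k

    def forward(self, region_feats: torch.Tensor) -> torch.Tensor:
        # region_feats: (num_regions, feat_dim), e.g. RoI-pooled features.
        q = self.query(region_feats)                # (R, D)
        k = self.key(self.context_bank)             # (C, D)
        v = self.value(self.context_bank)           # (C, D)
        scores = q @ k.t() / q.shape[-1] ** 0.5     # (R, C)

        # Sparsify: keep only the top-k context entries per region,
        # set the rest to -inf so softmax assigns them zero weight.
        topk_vals, topk_idx = scores.topk(self.top_k, dim=-1)
        mask = torch.full_like(scores, float("-inf"))
        mask.scatter_(-1, topk_idx, topk_vals)
        attn = F.softmax(mask, dim=-1)              # (R, C), sparse rows

        context = attn @ v                          # (R, D)
        # Fuse the sparse context back into the region features.
        return region_feats + context


if __name__ == "__main__":
    block = SparseContextBlock(feat_dim=256)
    feats = torch.randn(8, 256)    # 8 region proposals
    print(block(feats).shape)      # torch.Size([8, 256])
```

In a transfer-learning pipeline of the kind described in the abstract, such a block would typically be inserted into the detection head of the source-pretrained model before fine-tuning on the few labeled target-domain images.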
Related papers
- Improved Region Proposal Network for Enhanced Few-Shot Object Detection [23.871860648919593]
Few-shot object detection (FSOD) methods have emerged as a solution to the limitations of classic object detection approaches.
We develop a semi-supervised algorithm to detect and then utilize unlabeled novel objects as positive samples during the FSOD training stage.
Our improved hierarchical sampling strategy for the region proposal network (RPN) also boosts the detection model's perception of large objects.
arXiv Detail & Related papers (2023-08-15T02:35:59Z)
- Identification of Novel Classes for Improving Few-Shot Object Detection [12.013345715187285]
Few-shot object detection (FSOD) methods offer a remedy by realizing robust object detection using only a few training samples per class.
We develop a semi-supervised algorithm to detect and then utilize unlabeled novel objects as positive samples during training to improve FSOD performance.
Our experimental results indicate that our method is effective and outperforms the existing state-of-the-art (SOTA) FSOD methods.
arXiv Detail & Related papers (2023-03-18T14:12:52Z)
- CLIP the Gap: A Single Domain Generalization Approach for Object Detection [60.20931827772482]
Single Domain Generalization tackles the problem of training a model on a single source domain so that it generalizes to any unseen target domain.
We propose to leverage a pre-trained vision-language model to introduce semantic domain concepts via textual prompts.
We achieve this via a semantic augmentation strategy acting on the features extracted by the detector backbone, as well as a text-based classification loss.
arXiv Detail & Related papers (2023-01-13T12:01:18Z)
- Cross Domain Object Detection by Target-Perceived Dual Branch Distillation [49.68119030818388]
Cross domain object detection is a realistic and challenging task in the wild.
We propose a novel Target-perceived Dual-branch Distillation (TDD) framework.
Our TDD significantly outperforms the state-of-the-art methods on all the benchmarks.
arXiv Detail & Related papers (2022-05-03T03:51:32Z)
- Attentive Prototypes for Source-free Unsupervised Domain Adaptive 3D Object Detection [85.11649974840758]
3D object detection networks tend to be biased towards the data they are trained on.
We propose a single-frame approach for source-free, unsupervised domain adaptation of lidar-based 3D object detectors.
arXiv Detail & Related papers (2021-11-30T18:42:42Z)
- Aligning Pretraining for Detection via Object-Level Contrastive Learning [57.845286545603415]
Image-level contrastive representation learning has proven to be highly effective as a generic model for transfer learning.
We argue that this could be sub-optimal and thus advocate a design principle which encourages alignment between the self-supervised pretext task and the downstream task.
Our method, called Selective Object COntrastive learning (SoCo), achieves state-of-the-art results for transfer performance on COCO detection.
arXiv Detail & Related papers (2021-06-04T17:59:52Z)
- Instance Localization for Self-supervised Detection Pretraining [68.24102560821623]
We propose a new self-supervised pretext task, called instance localization.
We show that integration of bounding boxes into pretraining promotes better task alignment and architecture alignment for transfer learning.
Experimental results demonstrate that our approach yields state-of-the-art transfer learning results for object detection.
arXiv Detail & Related papers (2021-02-16T17:58:57Z)
- One-Shot Object Detection without Fine-Tuning [62.39210447209698]
We introduce a two-stage model consisting of a first stage Matching-FCOS network and a second stage Structure-Aware Relation Module.
We also propose novel training strategies that effectively improve detection performance.
Our method exceeds the state-of-the-art one-shot performance consistently on multiple datasets.
arXiv Detail & Related papers (2020-05-08T01:59:23Z)
- Context-Transformer: Tackling Object Confusion for Few-Shot Detection [0.0]
We propose a novel Context-Transformer within a concise deep transfer framework.
Context-Transformer can effectively leverage source-domain object knowledge as guidance.
It can adaptively integrate these relational clues to enhance the discriminative power of the detector.
arXiv Detail & Related papers (2020-03-16T16:17:11Z)