CD-FSOD: A Benchmark for Cross-domain Few-shot Object Detection
- URL: http://arxiv.org/abs/2210.05311v3
- Date: Wed, 3 May 2023 09:19:05 GMT
- Title: CD-FSOD: A Benchmark for Cross-domain Few-shot Object Detection
- Authors: Wuti Xiong
- Abstract summary: We evaluate state-of-the-art FSOD approaches, including meta-learning FSOD approaches and fine-tuning FSOD approaches.
Our approach is remarkably superior to existing approaches by significant margins.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a study of the cross-domain few-shot object
detection (CD-FSOD) benchmark, consisting of image data from a diverse data
domain. On the proposed benchmark, we evaluate state-of-the-art FSOD approaches,
including meta-learning FSOD approaches and fine-tuning FSOD approaches. The
results show that these methods tend to fall, and even underperform the naive
fine-tuning model. We analyze the reasons for their failure and introduce a
strong baseline that uses a mutually-beneficial manner to alleviate the
overfitting problem. Our approach is remarkably superior to existing approaches
by significant margins (2.0% on average) on the proposed benchmark. Our code
is available at https://github.com/FSOD/CD-FSOD.
Related papers
- On the Robustness of Human-Object Interaction Detection against Distribution Shift [27.40641711088878]
Human-Object Interaction (HOI) detection has seen substantial advances in recent years.
Existing works focus on the standard setting with ideal images and natural distribution, far from practical scenarios with inevitable distribution shifts.
In this work, we investigate this issue by benchmarking, analyzing, and enhancing the robustness of HOI detection models under various distribution shifts.
arXiv Detail & Related papers (2025-06-22T13:01:34Z)
- Enhance Then Search: An Augmentation-Search Strategy with Foundation Models for Cross-Domain Few-Shot Object Detection [13.980798935767558]
Foundation models pretrained on extensive datasets have performed remarkably in the cross-domain few-shot object detection task.
We found that the integration of image-based data augmentation techniques and grid-based sub-domain search strategy significantly enhances the performance of these foundation models.
Our findings substantially advance the practical deployment of vision-language models in data-scarce environments.
arXiv Detail & Related papers (2025-04-06T15:30:35Z)
- Enhanced OoD Detection through Cross-Modal Alignment of Multi-Modal Representations [2.992602379681373]
We show that multi-modal fine-tuning can achieve notable OoDD performance.
We propose a training objective that enhances cross-modal alignment by regularizing the distances between image and text embeddings of ID data.
arXiv Detail & Related papers (2025-03-24T16:00:21Z)
- Understanding the Cross-Domain Capabilities of Video-Based Few-Shot Action Recognition Models [3.072340427031969]
Few-shot action recognition (FSAR) aims to learn a model capable of identifying novel actions in videos using only a few examples.
In assuming the base dataset seen during meta-training and novel dataset used for evaluation can come from different domains, cross-domain few-shot learning alleviates data collection and annotation costs.
We systematically evaluate existing state-of-the-art single-domain, transfer-based, and cross-domain FSAR methods on new cross-domain tasks.
arXiv Detail & Related papers (2024-06-03T07:48:18Z)
- Cross-Domain Few-Shot Object Detection via Enhanced Open-Set Object Detector [72.05791402494727]
This paper studies the challenging cross-domain few-shot object detection (CD-FSOD) task.
It aims to develop an accurate object detector for novel domains with minimal labeled examples.
arXiv Detail & Related papers (2024-02-05T15:25:32Z)
- Dense Affinity Matching for Few-Shot Segmentation [83.65203917246745]
Few-Shot Segmentation (FSS) aims to segment novel-class images given only a few samples.
We propose a dense affinity matching framework to exploit the support-query interaction.
We show that our framework performs very competitively under different settings with only 0.68M parameters.
arXiv Detail & Related papers (2023-07-17T12:27:15Z)
- Cluster-level pseudo-labelling for source-free cross-domain facial expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER).
Our method exploits self-supervised pretraining to learn good feature representations from the target data.
We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z)
- A Simple Information-Based Approach to Unsupervised Domain-Adaptive Aspect-Based Sentiment Analysis [58.124424775536326]
We propose a simple but effective technique based on mutual information to extract their term.
Experiment results show that our proposed method outperforms the state-of-the-art methods for cross-domain ABSA by 4.32% Micro-F1.
arXiv Detail & Related papers (2022-01-29T10:18:07Z)
- Plug-and-Play Few-shot Object Detection with Meta Strategy and Explicit Localization Inference [78.41932738265345]
This paper proposes a plug-and-play detector that can accurately detect objects of novel categories without a fine-tuning process.
We introduce two explicit inferences into the localization process to reduce its dependence on annotated data.
It shows a significant lead in efficiency, precision, and recall under varied evaluation protocols.
arXiv Detail & Related papers (2021-10-26T03:09:57Z)
- An Enhanced Span-based Decomposition Method for Few-Shot Sequence Labeling [27.468499201647063]
Few-Shot Sequence Labeling (FSSL) is a canonical solution for the tagging models to generalize on an emerging, resource-scarce domain.
We propose an Enhanced Span-based Decomposition method, which follows the metric-based meta-learning paradigm for FSSL.
arXiv Detail & Related papers (2021-09-27T12:59:48Z)
- SimROD: A Simple Adaptation Method for Robust Object Detection [8.307942341807152]
This paper presents a simple and effective unsupervised adaptation method for Robust Object Detection (SimROD).
Our method integrates a novel domain-centric augmentation method, a gradual self-labeling adaptation procedure, and a teacher-guided fine-tuning mechanism.
When applied to image corruptions and high-level cross-domain adaptation benchmarks, our method outperforms prior baselines.
arXiv Detail & Related papers (2021-07-28T14:28:32Z)
- Multi-Scale Positive Sample Refinement for Few-Shot Object Detection [61.60255654558682]
Few-shot object detection (FSOD) helps detectors adapt to unseen classes with few training instances.
We propose a Multi-scale Positive Sample Refinement (MPSR) approach to enrich object scales in FSOD.
MPSR generates multi-scale positive samples as object pyramids and refines the prediction at various scales.
arXiv Detail & Related papers (2020-07-18T09:48:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.