ODOV: Towards Open-Domain Open-Vocabulary Object Detection
- URL: http://arxiv.org/abs/2508.01253v1
- Date: Sat, 02 Aug 2025 08:10:45 GMT
- Title: ODOV: Towards Open-Domain Open-Vocabulary Object Detection
- Authors: Yupeng Zhang, Ruize Han, Fangnan Zhou, Song Wang, Wei Feng, Liang Wan,
- Abstract summary: We first construct a new benchmark OD-LVIS, which includes 46,949 images, covers 18 complex real-world domains and 1,203 categories.<n>We then develop a novel baseline method for ODOV detection.<n>We provide sufficient benchmark evaluations for the proposed ODOV detection task and report the results.
- Score: 28.25079830063646
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we handle a new problem of Open-Domain Open-Vocabulary (ODOV) object detection, which considers the detection model's adaptability to the real world including both domain and category shifts. For this problem, we first construct a new benchmark OD-LVIS, which includes 46,949 images, covers 18 complex real-world domains and 1,203 categories, and provides a comprehensive dataset for evaluating real-world object detection. Besides, we develop a novel baseline method for ODOV detection.The proposed method first leverages large language models to generate the domain-agnostic text prompts for category embedding. It further learns the domain embedding from the given image, which, during testing, can be integrated into the category embedding to form the customized domain-specific category embedding for each test image. We provide sufficient benchmark evaluations for the proposed ODOV detection task and report the results, which verify the rationale of ODOV detection, the usefulness of our benchmark, and the superiority of the proposed method.
Related papers
- DATR: Unsupervised Domain Adaptive Detection Transformer with Dataset-Level Adaptation and Prototypical Alignment [7.768332621617199]
We introduce a strong DETR-based detector named Domain Adaptive detection TRansformer ( DATR) for unsupervised domain adaptation of object detection.
Our proposed DATR incorporates a mean-teacher based self-training framework, utilizing pseudo-labels generated by the teacher model to further mitigate domain bias.
Experiments demonstrate superior performance and generalization capabilities of our proposed DATR in multiple domain adaptation scenarios.
arXiv Detail & Related papers (2024-05-20T03:48:45Z) - Cross-Domain Few-Shot Object Detection via Enhanced Open-Set Object Detector [72.05791402494727]
This paper studies the challenging cross-domain few-shot object detection (CD-FSOD)
It aims to develop an accurate object detector for novel domains with minimal labeled examples.
arXiv Detail & Related papers (2024-02-05T15:25:32Z) - CLIP the Gap: A Single Domain Generalization Approach for Object
Detection [60.20931827772482]
Single Domain Generalization tackles the problem of training a model on a single source domain so that it generalizes to any unseen target domain.
We propose to leverage a pre-trained vision-language model to introduce semantic domain concepts via textual prompts.
We achieve this via a semantic augmentation strategy acting on the features extracted by the detector backbone, as well as a text-based classification loss.
arXiv Detail & Related papers (2023-01-13T12:01:18Z) - Application of Unsupervised Domain Adaptation for Structural MRI
Analysis [0.0]
We study the effectiveness of an unsupervised domain adaptation approach for various applications such as binary classification and anomaly detection.
We also explore image reconstruction and image synthesis for analyzing and generating 3D structural MRI data.
We successfully demonstrate that domain adaptation improves the performance of AD detection when implemented in both supervised and unsupervised settings.
arXiv Detail & Related papers (2022-12-26T01:59:56Z) - D2DF2WOD: Learning Object Proposals for Weakly-Supervised Object
Detection via Progressive Domain Adaptation [25.41133780678981]
D2DF2WOD is a Fully-to-Weakly Supervised Object Detection framework.
It uses synthetic data, annotated with precise object localization, to supplement a natural image target domain.
Our method results in consistently improved object detection and localization compared with state-of-the-art methods.
arXiv Detail & Related papers (2022-12-02T18:58:03Z) - Domain Generalisation for Object Detection under Covariate and Concept Shift [10.32461766065764]
Domain generalisation aims to promote the learning of domain-invariant features while suppressing domain-specific features.
An approach to domain generalisation for object detection is proposed, the first such approach applicable to any object detection architecture.
arXiv Detail & Related papers (2022-03-10T11:14:18Z) - Decompose to Adapt: Cross-domain Object Detection via Feature
Disentanglement [79.2994130944482]
We design a Domain Disentanglement Faster-RCNN (DDF) to eliminate the source-specific information in the features for detection task learning.
Our DDF method facilitates the feature disentanglement at the global and local stages, with a Global Triplet Disentanglement (GTD) module and an Instance Similarity Disentanglement (ISD) module.
By outperforming state-of-the-art methods on four benchmark UDA object detection tasks, our DDF method is demonstrated to be effective with wide applicability.
arXiv Detail & Related papers (2022-01-06T05:43:01Z) - Towards Novel Target Discovery Through Open-Set Domain Adaptation [73.81537683043206]
Open-set domain adaptation (OSDA) considers that the target domain contains samples from novel categories unobserved in external source domain.
We propose a novel framework to accurately identify the seen categories in target domain, and effectively recover the semantic attributes for unseen categories.
arXiv Detail & Related papers (2021-05-06T04:22:29Z) - Bi-Dimensional Feature Alignment for Cross-Domain Object Detection [71.85594342357815]
We propose a novel unsupervised cross-domain detection model.
It exploits the annotated data in a source domain to train an object detector for a different target domain.
The proposed model mitigates the cross-domain representation divergence for object detection.
arXiv Detail & Related papers (2020-11-14T03:03:11Z) - Cross-domain Detection via Graph-induced Prototype Alignment [114.8952035552862]
We propose a Graph-induced Prototype Alignment (GPA) framework to seek for category-level domain alignment.
In addition, in order to alleviate the negative effect of class-imbalance on domain adaptation, we design a Class-reweighted Contrastive Loss.
Our approach outperforms existing methods with a remarkable margin.
arXiv Detail & Related papers (2020-03-28T17:46:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.