Phrase Grounding-based Style Transfer for Single-Domain Generalized
Object Detection
- URL: http://arxiv.org/abs/2402.01304v2
- Date: Mon, 5 Feb 2024 03:04:58 GMT
- Title: Phrase Grounding-based Style Transfer for Single-Domain Generalized
Object Detection
- Authors: Hao Li, Wei Wang, Cong Wang, Zhigang Luo, Xinwang Liu, Kenli Li and
Xiaochun Cao
- Abstract summary: Single-domain generalized object detection aims to enhance a model's generalizability to multiple unseen target domains.
This is a practical yet challenging task as it requires the model to address domain shift without incorporating target domain data into training.
We propose a novel phrase grounding-based style transfer approach for the task.
- Score: 109.58348694132091
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Single-domain generalized object detection aims to enhance a model's
generalizability to multiple unseen target domains using only data from a
single source domain during training. This is a practical yet challenging task
as it requires the model to address domain shift without incorporating target
domain data into training. In this paper, we propose a novel phrase
grounding-based style transfer (PGST) approach for the task. Specifically, we
first define textual prompts to describe potential objects for each unseen
target domain. Then, we leverage the grounded language-image pre-training
(GLIP) model to learn the style of these target domains and achieve style
transfer from the source to the target domain. The style-transferred source
visual features are semantically rich and could be close to imaginary
counterparts in the target domain. Finally, we employ these style-transferred
visual features to fine-tune GLIP. By introducing imaginary counterparts, the
detector could be effectively generalized to unseen target domains using only a
single source domain for training. Extensive experimental results on five
diverse weather driving benchmarks demonstrate our proposed approach achieves
state-of-the-art performance, even surpassing some domain adaptive methods that
incorporate target domain images into the training process. The source code and
pre-trained models will be made available.
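The abstract describes transferring each unseen target domain's "style" onto source visual features before fine-tuning GLIP, but this summary does not spell out the transfer mechanism. A common way to realize feature-level style transfer is AdaIN-style statistic matching: normalize the source feature map per channel, then re-scale it with the target domain's channel statistics. The sketch below illustrates only that generic idea, not the paper's actual method; the function name and the "foggy-domain" statistics are invented for illustration.

```python
import numpy as np

def adain_style_transfer(source_feats, target_mean, target_std, eps=1e-5):
    """Re-style a (C, H, W) feature map: strip the source domain's
    per-channel statistics, then apply the target domain's (AdaIN-style)."""
    mu = source_feats.mean(axis=(1, 2), keepdims=True)    # per-channel mean
    sigma = source_feats.std(axis=(1, 2), keepdims=True)  # per-channel std
    normalized = (source_feats - mu) / (sigma + eps)
    return normalized * target_std + target_mean

# Toy demo: random source features and an imagined "foggy" target style.
rng = np.random.default_rng(0)
feats = rng.normal(loc=2.0, scale=3.0, size=(4, 8, 8))
target_mean = np.full((4, 1, 1), 0.5)
target_std = np.full((4, 1, 1), 1.5)

styled = adain_style_transfer(feats, target_mean, target_std)
print(styled.mean(axis=(1, 2)))  # each channel ≈ 0.5
print(styled.std(axis=(1, 2)))   # each channel ≈ 1.5
```

In the paper's setting, the target-domain statistics would not be hand-set as here but learned from textual prompts via the grounded language-image pre-training model; the fine-tuning step then treats the re-styled features as imaginary target-domain samples.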
Related papers
- Domain-Rectifying Adapter for Cross-Domain Few-Shot Segmentation [40.667166043101076]
We propose a small adapter for rectifying diverse target domain styles to the source domain.
The adapter is trained to rectify the image features from diverse synthesized target domains to align with the source domain.
Our method achieves promising results on cross-domain few-shot semantic segmentation tasks.
arXiv Detail & Related papers (2024-04-16T07:07:40Z)
- Online Prototype Alignment for Few-shot Policy Transfer [18.310398679044244]
We propose a novel framework to learn the mapping function based on the functional similarity of elements.
Online Prototype Alignment (OPA) achieves few-shot policy transfer within only a few episodes.
arXiv Detail & Related papers (2023-06-12T11:42:13Z)
- Pulling Target to Source: A New Perspective on Domain Adaptive Semantic Segmentation
Domain adaptive semantic segmentation aims to transfer knowledge from a labeled source domain to an unlabeled target domain.
We propose T2S-DA, which we interpret as a form of pulling Target to Source for Domain Adaptation.
arXiv Detail & Related papers (2023-05-23T07:09:09Z)
- CLIP the Gap: A Single Domain Generalization Approach for Object Detection [60.20931827772482]
Single Domain Generalization tackles the problem of training a model on a single source domain so that it generalizes to any unseen target domain.
We propose to leverage a pre-trained vision-language model to introduce semantic domain concepts via textual prompts.
We achieve this via a semantic augmentation strategy acting on the features extracted by the detector backbone, as well as a text-based classification loss.
arXiv Detail & Related papers (2023-01-13T12:01:18Z)
- PØDA: Prompt-driven Zero-shot Domain Adaptation [27.524962843495366]
We adapt a model trained on a source domain using only a general description in natural language of the target domain, i.e., a prompt.
We show that these prompt-driven augmentations can be used to perform zero-shot domain adaptation for semantic segmentation.
arXiv Detail & Related papers (2022-12-06T18:59:58Z)
- Domain Adaptation via Prompt Learning [39.97105851723885]
Unsupervised domain adaptation (UDA) aims to adapt models learned from a well-annotated source domain to a target domain.
We introduce a novel prompt learning paradigm for UDA, named Domain Adaptation via Prompt Learning (DAPL)
arXiv Detail & Related papers (2022-02-14T13:25:46Z)
- Generalized Source-free Domain Adaptation [47.907168218249694]
We propose a new domain adaptation paradigm called Generalized Source-free Domain Adaptation (G-SFDA)
On target performance, our method is on par with or better than existing DA and SFDA methods; specifically, it achieves state-of-the-art performance (85.4%) on VisDA.
arXiv Detail & Related papers (2021-08-03T16:34:12Z)
- Surprisingly Simple Semi-Supervised Domain Adaptation with Pretraining and Consistency [93.89773386634717]
Visual domain adaptation involves learning to classify images from a target visual domain using labels available in a different source domain.
We show that in the presence of a few target labels, simple techniques like self-supervision (via rotation prediction) and consistency regularization can be effective without any adversarial alignment to learn a good target classifier.
Our Pretraining and Consistency (PAC) approach can achieve state-of-the-art accuracy on this semi-supervised domain adaptation task, surpassing multiple adversarial domain alignment methods across multiple datasets.
arXiv Detail & Related papers (2021-01-29T18:40:17Z)
- Alleviating Semantic-level Shift: A Semi-supervised Domain Adaptation Method for Semantic Segmentation [97.8552697905657]
A key challenge of this task is how to alleviate the data distribution discrepancy between the source and target domains.
We propose Alleviating Semantic-level Shift (ASS), which can successfully promote the distribution consistency from both global and local views.
We apply our ASS to two domain adaptation tasks, from GTA5 to Cityscapes and from Synthia to Cityscapes.
arXiv Detail & Related papers (2020-04-02T03:25:05Z)
- Cross-domain Self-supervised Learning for Domain Adaptation with Few Source Labels [78.95901454696158]
We propose a novel Cross-Domain Self-supervised learning approach for domain adaptation.
Our method significantly boosts target accuracy in the new target domain with few source labels.
arXiv Detail & Related papers (2020-03-18T15:11:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.