SIDA: Synthetic Image Driven Zero-shot Domain Adaptation
- URL: http://arxiv.org/abs/2507.18632v1
- Date: Thu, 24 Jul 2025 17:59:36 GMT
- Title: SIDA: Synthetic Image Driven Zero-shot Domain Adaptation
- Authors: Ye-Chan Kim, SeungJu Cha, Si-Woo Kim, Taewhan Kim, Dong-Jin Kim
- Abstract summary: Zero-shot domain adaptation is a method for adapting a model to a target domain without utilizing target domain image data. We propose SIDA, a novel and efficient zero-shot domain adaptation method leveraging synthetic images. We demonstrate the effectiveness of our method by showing state-of-the-art performance in diverse zero-shot adaptation scenarios.
- Score: 5.542712070598464
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Zero-shot domain adaptation adapts a model to a target domain without using any target domain image data. To enable adaptation without target images, existing studies utilize CLIP's embedding space and text descriptions to simulate target-like style features. Despite previous achievements in zero-shot domain adaptation, we observe that these text-driven methods struggle to capture complex real-world variations and significantly increase adaptation time due to their alignment process. Instead of relying on text descriptions, we explore solutions leveraging image data, which provides diverse and more fine-grained style cues. In this work, we propose SIDA, a novel and efficient zero-shot domain adaptation method leveraging synthetic images. To generate synthetic images, we first create detailed, source-like images and apply image translation to reflect the style of the target domain. We then utilize the style features of these synthetic images as a proxy for the target domain. Based on these features, we introduce the Domain Mix and Patch Style Transfer modules, which enable effective modeling of real-world variations. In particular, Domain Mix blends multiple styles to expand the intra-domain representations, and Patch Style Transfer assigns different styles to individual patches. We demonstrate the effectiveness of our method by showing state-of-the-art performance in diverse zero-shot adaptation scenarios, particularly in challenging domains. Moreover, our approach achieves high efficiency by significantly reducing the overall adaptation time.
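The abstract does not spell out the implementation, but the two modules lend themselves to a compact sketch. Below is a minimal, hypothetical PyTorch rendering that assumes AdaIN-style channel-wise (mean, std) statistics as the style representation, with the synthetic images' statistics collected into a style bank; the paper's exact formulation may differ.

```python
import torch

def domain_mix(style_bank):
    """Domain Mix (sketch): blend several synthetic styles into one.

    style_bank: (S, 2, C) tensor of per-style channel-wise (mean, std)
    pairs extracted from synthetic images. Returns a (2, C) mixed style.
    """
    # A random convex combination keeps the mix inside the styles' hull,
    # expanding intra-domain variation without leaving the domain.
    alpha = torch.distributions.Dirichlet(torch.ones(style_bank.size(0))).sample()
    return torch.einsum("s,spc->pc", alpha, style_bank)

def patch_style_transfer(feat, style_bank, patch=16, eps=1e-5):
    """Patch Style Transfer (sketch): give each spatial patch its own style.

    feat: (B, C, H, W) source feature map; style_bank as above.
    """
    b, c, h, w = feat.shape
    out = feat.clone()
    for y in range(0, h, patch):
        for x in range(0, w, patch):
            p = feat[:, :, y:y + patch, x:x + patch]
            mu = p.mean(dim=(2, 3), keepdim=True)
            sd = p.std(dim=(2, 3), keepdim=True) + eps
            # Each patch draws its own (possibly mixed) target style.
            s_mu, s_sd = domain_mix(style_bank)
            out[:, :, y:y + patch, x:x + patch] = (
                (p - mu) / sd * s_sd.view(1, c, 1, 1) + s_mu.view(1, c, 1, 1)
            )
    return out
```

In this reading, Domain Mix expands intra-domain representations by staying within a convex hull of observed styles, while Patch Style Transfer re-normalizes each patch independently, which is what lets different patches carry different styles.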
Related papers
- Zero Shot Domain Adaptive Semantic Segmentation by Synthetic Data Generation and Progressive Adaptation [8.124539956043074]
We present a novel method that tackles zero-shot domain adaptive semantic segmentation, in which no target images are available. We use a pretrained, off-the-shelf text-to-image diffusion model to generate training images by transferring source domain images to the target style. To mitigate the impact of noise in the synthetic data, we design a progressive adaptation strategy that ensures robust learning throughout the training process.
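For concreteness, the "transfer source images to target style with an off-the-shelf diffusion model" step can be realized with Hugging Face diffusers roughly as below; the checkpoint, prompt, and strength are illustrative placeholders, not the paper's actual configuration.

```python
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from PIL import Image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

src = Image.open("source_scene.png").convert("RGB").resize((512, 512))
# Low-to-moderate strength keeps the source layout (and thus its labels)
# while the prompt injects the target-domain style.
styled = pipe(
    prompt="the same street scene at night in heavy rain",
    image=src,
    strength=0.5,
    guidance_scale=7.5,
).images[0]
styled.save("target_style_scene.png")
```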
arXiv Detail & Related papers (2025-08-05T10:21:09Z)
- HyperGAN-CLIP: A Unified Framework for Domain Adaptation, Image Synthesis and Manipulation [21.669044026456557]
Generative Adversarial Networks (GANs) have demonstrated remarkable capabilities in generating highly realistic images.
We present a novel framework that significantly extends the capabilities of a pre-trained StyleGAN by integrating CLIP space via hypernetworks.
Our approach demonstrates unprecedented flexibility, enabling text-guided image manipulation without the need for text-specific training data.
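The summary names the mechanism without detailing it, but the core idea, a hypernetwork that maps a CLIP embedding to weight offsets for a frozen generator, can be sketched as follows. The layer shape and the plain linear map are assumptions for illustration only.

```python
import torch
import torch.nn as nn

class CLIPHyperModulator(nn.Module):
    """Toy hypernetwork: turns a CLIP embedding (of a target-domain text or
    image) into an additive offset for one frozen generator weight matrix,
    a stand-in for modulating a pre-trained StyleGAN layer."""

    def __init__(self, clip_dim=512, w_shape=(512, 512)):
        super().__init__()
        self.w_shape = w_shape
        self.to_delta = nn.Linear(clip_dim, w_shape[0] * w_shape[1])

    def forward(self, base_weight, clip_emb):
        delta = self.to_delta(clip_emb).view(self.w_shape)
        return base_weight + delta  # domain-adapted weights for this layer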
arXiv Detail & Related papers (2024-11-19T19:36:18Z)
- Language Guided Domain Generalized Medical Image Segmentation [68.93124785575739]
Single source domain generalization holds promise for more reliable and consistent image segmentation across real-world clinical settings.
We propose an approach that explicitly leverages textual information by incorporating a contrastive learning mechanism guided by the text encoder features.
Our approach achieves favorable performance against existing methods in literature.
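As a hedged illustration of "a contrastive learning mechanism guided by the text encoder features", the sketch below pulls image features toward their class's frozen text embedding, CLIP-style; the exact objective in the paper may differ.

```python
import torch
import torch.nn.functional as F

def text_guided_contrastive_loss(img_emb, txt_emb, labels, tau=0.07):
    """img_emb: (N, D) image/region embeddings from the segmentation encoder.
    txt_emb: (K, D) frozen text-encoder embeddings, one per class.
    labels:  (N,) class index of each image embedding.
    """
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = img_emb @ txt_emb.t() / tau  # (N, K) cosine similarities
    # Cross-entropy over classes = InfoNCE with text embeddings as anchors.
    return F.cross_entropy(logits, labels)
```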
arXiv Detail & Related papers (2024-04-01T17:48:15Z)
- Phrase Grounding-based Style Transfer for Single-Domain Generalized Object Detection [109.58348694132091]
Single-domain generalized object detection aims to enhance a model's generalizability to multiple unseen target domains.
This is a practical yet challenging task as it requires the model to address domain shift without incorporating target domain data into training.
We propose a novel phrase grounding-based style transfer approach for the task.
arXiv Detail & Related papers (2024-02-02T10:48:43Z)
- Improving Diversity in Zero-Shot GAN Adaptation with Semantic Variations [61.132408427908175]
Zero-shot GAN adaptation aims to reuse well-trained generators to synthesize images of an unseen target domain.
With only a single representative text feature instead of real images, the synthesized images gradually lose diversity.
We propose a novel method to find semantic variations of the target text in the CLIP space.
arXiv Detail & Related papers (2023-08-21T08:12:28Z)
- One-shot Unsupervised Domain Adaptation with Personalized Diffusion Models [15.590759602379517]
Adapting a segmentation model from a labeled source domain to a target domain is one of the most challenging problems in domain adaptation.
We leverage text-to-image diffusion models to generate a synthetic target dataset with photo-realistic images.
Experiments show that our method surpasses the state-of-the-art OSUDA methods by up to +7.1%.
arXiv Detail & Related papers (2023-03-31T14:16:38Z)
- Towards Diverse and Faithful One-shot Adaption of Generative Adversarial Networks [54.80435295622583]
One-shot generative domain adaptation aims to transfer a pre-trained generator on one domain to a new domain using only one reference image.
We present DiFa, a novel one-shot generative domain adaptation method for diverse generation and faithful adaptation.
arXiv Detail & Related papers (2022-07-18T16:29:41Z)
- Adversarial Style Augmentation for Domain Generalized Urban-Scene Segmentation [120.96012935286913]
We propose a novel adversarial style augmentation approach, which can generate hard stylized images during training.
Experiments on two synthetic-to-real semantic segmentation benchmarks demonstrate that AdvStyle can significantly improve the model performance on unseen real domains.
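A rough sketch of the idea, under the common assumption that "style" means channel-wise feature statistics: take gradient ascent steps on the (mean, std) of a feature map so the re-stylized features maximize the task loss. The step size and loop count here are placeholders, not AdvStyle's actual settings.

```python
import torch

def adversarial_style(feat, task_loss_fn, steps=1, lr=1.0, eps=1e-5):
    """Perturb channel-wise style statistics to *increase* the task loss,
    producing a hard stylized feature map for training.

    feat: (B, C, H, W) features; task_loss_fn maps features -> scalar loss.
    """
    mu = feat.mean(dim=(2, 3), keepdim=True)
    sd = feat.std(dim=(2, 3), keepdim=True) + eps
    content = ((feat - mu) / sd).detach()          # style-free content
    adv_mu = mu.detach().clone().requires_grad_(True)
    adv_sd = sd.detach().clone().requires_grad_(True)
    for _ in range(steps):
        loss = task_loss_fn(content * adv_sd + adv_mu)
        g_mu, g_sd = torch.autograd.grad(loss, [adv_mu, adv_sd])
        with torch.no_grad():                      # gradient *ascent* step
            adv_mu += lr * g_mu
            adv_sd += lr * g_sd
    return (content * adv_sd + adv_mu).detach()
```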
arXiv Detail & Related papers (2022-07-11T14:01:25Z)
- DRANet: Disentangling Representation and Adaptation Networks for Unsupervised Cross-Domain Adaptation [23.588766224169493]
DRANet is a network architecture that disentangles image representations and transfers the visual attributes in a latent space for unsupervised cross-domain adaptation.
Our model encodes individual representations of content (scene structure) and style (artistic appearance) from both source and target images.
It adapts the domain by incorporating the transferred style factor into the content factor along with learnable weights specified for each domain.
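The summary describes the architecture concretely enough for a toy sketch: separate content and style encodings, recombined with a learnable per-domain weight before decoding. The encoder/decoder internals are left abstract, and the additive recombination is an assumption for illustration.

```python
import torch
import torch.nn as nn

class StyleContentRecombiner(nn.Module):
    """Toy DRANet-style module: encode content (scene structure) and style
    (artistic appearance) separately, then inject the transferred style into
    the content with a learnable weight specified per target domain."""

    def __init__(self, content_enc, style_enc, decoder, num_domains=2):
        super().__init__()
        self.content_enc = content_enc
        self.style_enc = style_enc
        self.decoder = decoder
        self.domain_w = nn.Parameter(torch.ones(num_domains))

    def forward(self, x_content, x_style, target_domain):
        c = self.content_enc(x_content)   # content from one domain
        s = self.style_enc(x_style)       # style from the other domain
        return self.decoder(c + self.domain_w[target_domain] * s)
```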
arXiv Detail & Related papers (2021-03-24T18:54:23Z)
- CrDoCo: Pixel-level Domain Transfer with Cross-Domain Consistency [119.45667331836583]
Unsupervised domain adaptation algorithms aim to transfer the knowledge learned from one domain to another.
We present a novel pixel-wise adversarial domain adaptation algorithm.
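The one-line summary hides a simple, checkable core: predictions on an image and on its cross-domain translation should agree pixel-wise, since translation is meant to change style but not content. A minimal sketch of such a consistency term follows (the image translator itself is assumed to exist):

```python
import torch.nn.functional as F

def cross_domain_consistency_loss(seg_model, img, img_translated):
    """Pixel-wise KL divergence between segmentation predictions on an image
    and on its image-to-image translated counterpart."""
    p_log = F.log_softmax(seg_model(img_translated), dim=1)
    q = F.softmax(seg_model(img), dim=1).detach()  # original view as target
    return F.kl_div(p_log, q, reduction="batchmean")
```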
arXiv Detail & Related papers (2020-01-09T19:00:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.