Improving Diversity in Zero-Shot GAN Adaptation with Semantic Variations
- URL: http://arxiv.org/abs/2308.10554v1
- Date: Mon, 21 Aug 2023 08:12:28 GMT
- Title: Improving Diversity in Zero-Shot GAN Adaptation with Semantic Variations
- Authors: Seogkyu Jeon, Bei Liu, Pilhyeon Lee, Kibeom Hong, Jianlong Fu, Hyeran
Byun
- Abstract summary: zero-shot GAN adaptation aims to reuse well-trained generators to synthesize images of an unseen target domain.
With only a single representative text feature instead of real images, the synthesized images gradually lose diversity.
We propose a novel method to find semantic variations of the target text in the CLIP space.
- Score: 61.132408427908175
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training deep generative models usually requires a large amount of data. To
alleviate the data collection cost, the task of zero-shot GAN adaptation aims
to reuse well-trained generators to synthesize images of an unseen target
domain without any further training samples. Due to the data absence, the
textual description of the target domain and the vision-language models, e.g.,
CLIP, are utilized to effectively guide the generator. However, with only a
single representative text feature instead of real images, the synthesized
images gradually lose diversity as the model is optimized, which is also known
as mode collapse. To tackle the problem, we propose a novel method to find
semantic variations of the target text in the CLIP space. Specifically, we
explore diverse semantic variations based on the informative text feature of
the target domain while regularizing the uncontrolled deviation of the semantic
information. With the obtained variations, we design a novel directional moment
loss that matches the first and second moments of image and text direction
distributions. Moreover, we introduce elastic weight consolidation and a
relation consistency loss to effectively preserve valuable content information
from the source domain, e.g., appearances. Through extensive experiments, we
demonstrate the efficacy of the proposed methods in ensuring sample diversity
in various scenarios of zero-shot GAN adaptation. We also conduct ablation
studies to validate the effect of each proposed component. Notably, our model
achieves a new state-of-the-art on zero-shot GAN adaptation in terms of both
diversity and quality.
Related papers
- DALDA: Data Augmentation Leveraging Diffusion Model and LLM with Adaptive Guidance Scaling [6.7206291284535125]
We present an effective data augmentation framework leveraging the Large Language Model (LLM) and Diffusion Model (DM)
Our approach addresses the issue of increasing the diversity of synthetic images.
Our method produces synthetic images with enhanced diversity while maintaining adherence to the target distribution.
arXiv Detail & Related papers (2024-09-25T14:02:43Z) - Few-Shot Image Generation by Conditional Relaxing Diffusion Inversion [37.18537753482751]
Conditional Diffusion Relaxing Inversion (CRDI) is designed to enhance distribution diversity in synthetic image generation.
CRDI does not rely on fine-tuning based on only a few samples.
It focuses on reconstructing each target image instance and expanding diversity through few-shot learning.
arXiv Detail & Related papers (2024-07-09T21:58:26Z) - Adaptive Prompt Learning with Negative Textual Semantics and Uncertainty Modeling for Universal Multi-Source Domain Adaptation [15.773845409601389]
Universal Multi-source Domain Adaptation (UniMDA) transfers knowledge from multiple labeled source domains to an unlabeled target domain.
Existing solutions focus on excavating image features to detect unknown samples, ignoring abundant information contained in textual semantics.
We propose an Adaptive Prompt learning with Negative textual semantics and uncErtainty modeling method for UniMDA classification tasks.
arXiv Detail & Related papers (2024-04-23T02:54:12Z) - Language Guided Domain Generalized Medical Image Segmentation [68.93124785575739]
Single source domain generalization holds promise for more reliable and consistent image segmentation across real-world clinical settings.
We propose an approach that explicitly leverages textual information by incorporating a contrastive learning mechanism guided by the text encoder features.
Our approach achieves favorable performance against existing methods in literature.
arXiv Detail & Related papers (2024-04-01T17:48:15Z) - Text-guided Explorable Image Super-resolution [14.83045604603449]
We propose two approaches for zero-shot text-guided super-resolution.
We show that the proposed approaches result in diverse solutions that match the semantic meaning provided by the text prompt.
arXiv Detail & Related papers (2024-03-02T08:10:54Z) - Diversified in-domain synthesis with efficient fine-tuning for few-shot
classification [64.86872227580866]
Few-shot image classification aims to learn an image classifier using only a small set of labeled examples per class.
We propose DISEF, a novel approach which addresses the generalization challenge in few-shot learning using synthetic data.
We validate our method in ten different benchmarks, consistently outperforming baselines and establishing a new state-of-the-art for few-shot classification.
arXiv Detail & Related papers (2023-12-05T17:18:09Z) - Phasic Content Fusing Diffusion Model with Directional Distribution
Consistency for Few-Shot Model Adaption [73.98706049140098]
We propose a novel phasic content fusing few-shot diffusion model with directional distribution consistency loss.
Specifically, we design a phasic training strategy with phasic content fusion to help our model learn content and style information when t is large.
Finally, we propose a cross-domain structure guidance strategy that enhances structure consistency during domain adaptation.
arXiv Detail & Related papers (2023-09-07T14:14:11Z) - Consistency Regularization for Generalizable Source-free Domain
Adaptation [62.654883736925456]
Source-free domain adaptation (SFDA) aims to adapt a well-trained source model to an unlabelled target domain without accessing the source dataset.
Existing SFDA methods ONLY assess their adapted models on the target training set, neglecting the data from unseen but identically distributed testing sets.
We propose a consistency regularization framework to develop a more generalizable SFDA method.
arXiv Detail & Related papers (2023-08-03T07:45:53Z) - Semantic Image Synthesis via Diffusion Models [159.4285444680301]
Denoising Diffusion Probabilistic Models (DDPMs) have achieved remarkable success in various image generation tasks.
Recent work on semantic image synthesis mainly follows the emphde facto Generative Adversarial Nets (GANs)
arXiv Detail & Related papers (2022-06-30T18:31:51Z) - Consistency Regularization with High-dimensional Non-adversarial
Source-guided Perturbation for Unsupervised Domain Adaptation in Segmentation [15.428323201750144]
BiSIDA employs consistency regularization to efficiently exploit information from the unlabeled target dataset.
BiSIDA achieves new state-of-the-art on two commonly-used synthetic-to-real domain adaptation benchmarks.
arXiv Detail & Related papers (2020-09-18T03:26:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.