DomainStudio: Fine-Tuning Diffusion Models for Domain-Driven Image
Generation using Limited Data
- URL: http://arxiv.org/abs/2306.14153v4
- Date: Tue, 16 Jan 2024 08:57:11 GMT
- Title: DomainStudio: Fine-Tuning Diffusion Models for Domain-Driven Image
Generation using Limited Data
- Authors: Jingyuan Zhu, Huimin Ma, Jiansheng Chen, Jian Yuan
- Abstract summary: This paper proposes a novel DomainStudio approach to adapt DDPMs pre-trained on large-scale source datasets to target domains using limited data.
It is designed to keep the diversity of subjects provided by source domains and get high-quality and diverse adapted samples in target domains.
- Score: 20.998032566820907
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Denoising diffusion probabilistic models (DDPMs) have been proven capable of
synthesizing high-quality images with remarkable diversity when trained on
large amounts of data. Typical diffusion models and modern large-scale
conditional generative models like text-to-image generative models are
vulnerable to overfitting when fine-tuned on extremely limited data. Existing
works have explored subject-driven generation using a reference set containing
a few images. However, few prior works explore DDPM-based domain-driven
generation, which aims to learn the common features of target domains while
maintaining diversity. This paper proposes a novel DomainStudio approach to
adapt DDPMs pre-trained on large-scale source datasets to target domains using
limited data. It is designed to keep the diversity of subjects provided by
source domains and get high-quality and diverse adapted samples in target
domains. We propose to keep the relative distances between adapted samples to
achieve considerable generation diversity. In addition, we further enhance the
learning of high-frequency details for better generation quality. Our approach
is compatible with both unconditional and conditional diffusion models. This
work makes the first attempt to realize unconditional few-shot image generation
with diffusion models, achieving better quality and greater diversity than
current state-of-the-art GAN-based approaches. Moreover, this work also
significantly relieves overfitting for conditional generation and realizes
high-quality domain-driven generation, further expanding the applicable
scenarios of modern large-scale text-to-image models.
Related papers
- FDS: Feedback-guided Domain Synthesis with Multi-Source Conditional Diffusion Models for Domain Generalization [19.0284321951354]
Domain Generalization techniques aim to enhance model robustness by simulating novel data distributions during training.
We propose FDS, Feedback-guided Domain Synthesis, a novel strategy that employs diffusion models to synthesize novel, pseudo-domains.
Our evaluations demonstrate that this methodology sets new benchmarks in domain generalization performance across a range of challenging datasets.
arXiv Detail & Related papers (2024-07-04T02:45:29Z) - Regularized Training with Generated Datasets for Name-Only Transfer of Vision-Language Models [36.59260354292177]
Recent advancements in text-to-image generation have inspired researchers to generate datasets tailored for perception models using generative models.
We aim to fine-tune vision-language models to a specific classification model without access to any real images.
Despite the high fidelity of generated images, we observed a significant performance degradation when fine-tuning the model using the generated datasets.
arXiv Detail & Related papers (2024-06-08T10:43:49Z) - Source-Free Domain Adaptation with Diffusion-Guided Source Data Generation [6.087274577167399]
This paper introduces a novel approach to leverage the generalizability of Diffusion Models for Source-Free Domain Adaptation (DM-SFDA)
Our proposed DMSFDA method involves fine-tuning a pre-trained text-to-image diffusion model to generate source domain images.
We validate our approach through comprehensive experiments across a range of datasets, including Office-31, Office-Home, and VisDA.
arXiv Detail & Related papers (2024-02-07T14:56:13Z) - Steered Diffusion: A Generalized Framework for Plug-and-Play Conditional
Image Synthesis [62.07413805483241]
Steered Diffusion is a framework for zero-shot conditional image generation using a diffusion model trained for unconditional generation.
We present experiments using steered diffusion on several tasks including inpainting, colorization, text-guided semantic editing, and image super-resolution.
arXiv Detail & Related papers (2023-09-30T02:03:22Z) - Phasic Content Fusing Diffusion Model with Directional Distribution
Consistency for Few-Shot Model Adaption [73.98706049140098]
We propose a novel phasic content fusing few-shot diffusion model with directional distribution consistency loss.
Specifically, we design a phasic training strategy with phasic content fusion to help our model learn content and style information when t is large.
Finally, we propose a cross-domain structure guidance strategy that enhances structure consistency during domain adaptation.
arXiv Detail & Related papers (2023-09-07T14:14:11Z) - Conditional Generation from Unconditional Diffusion Models using
Denoiser Representations [94.04631421741986]
We propose adapting pre-trained unconditional diffusion models to new conditions using the learned internal representations of the denoiser network.
We show that augmenting the Tiny ImageNet training set with synthetic images generated by our approach improves the classification accuracy of ResNet baselines by up to 8%.
arXiv Detail & Related papers (2023-06-02T20:09:57Z) - Few-shot Image Generation with Diffusion Models [18.532357455856836]
Denoising diffusion probabilistic models (DDPMs) have been proven capable of synthesizing high-quality images with remarkable diversity when trained on large amounts of data.
Modern approaches are mainly built on Generative Adversarial Networks (GANs) and adapt models pre-trained on large source domains to target domains using a few available samples.
In this paper, we make the first attempt to study when do DDPMs overfit and suffer severe diversity degradation as training data become scarce.
arXiv Detail & Related papers (2022-11-07T02:18:27Z) - A Survey on Generative Diffusion Model [75.93774014861978]
Diffusion models are an emerging class of deep generative models.
They have certain limitations, including a time-consuming iterative generation process and confinement to high-dimensional Euclidean space.
This survey presents a plethora of advanced techniques aimed at enhancing diffusion models.
arXiv Detail & Related papers (2022-09-06T16:56:21Z) - Auto-regressive Image Synthesis with Integrated Quantization [55.51231796778219]
This paper presents a versatile framework for conditional image generation.
It incorporates the inductive bias of CNNs and powerful sequence modeling of auto-regression.
Our method achieves superior diverse image generation performance as compared with the state-of-the-art.
arXiv Detail & Related papers (2022-07-21T22:19:17Z) - Few-shot Image Generation via Cross-domain Correspondence [98.2263458153041]
Training generative models, such as GANs, on a target domain containing limited examples can easily result in overfitting.
In this work, we seek to utilize a large source domain for pretraining and transfer the diversity information from source to target.
To further reduce overfitting, we present an anchor-based strategy to encourage different levels of realism over different regions in the latent space.
arXiv Detail & Related papers (2021-04-13T17:59:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.