Scaling Laws of Synthetic Images for Model Training ... for Now
- URL: http://arxiv.org/abs/2312.04567v1
- Date: Thu, 7 Dec 2023 18:59:59 GMT
- Title: Scaling Laws of Synthetic Images for Model Training ... for Now
- Authors: Lijie Fan, Kaifeng Chen, Dilip Krishnan, Dina Katabi, Phillip Isola,
Yonglong Tian
- Abstract summary: We study the scaling laws of synthetic images generated by state of the art text-to-image models.
We observe that synthetic images demonstrate a scaling trend similar to, but slightly less effective than, real images in CLIP training.
- Score: 54.43596959598466
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent significant advances in text-to-image models unlock the possibility of
training vision systems using synthetic images, potentially overcoming the
difficulty of collecting curated data at scale. It is unclear, however, how
these models behave at scale, as more synthetic data is added to the training
set. In this paper we study the scaling laws of synthetic images generated by
state of the art text-to-image models, for the training of supervised models:
image classifiers with label supervision, and CLIP with language supervision.
We identify several factors, including text prompts, classifier-free guidance
scale, and types of text-to-image models, that significantly affect scaling
behavior. After tuning these factors, we observe that synthetic images
demonstrate a scaling trend similar to, but slightly less effective than, real
images in CLIP training, while they significantly underperform in scaling when
training supervised image classifiers. Our analysis indicates that the main
reason for this underperformance is the inability of off-the-shelf
text-to-image models to generate certain concepts, a limitation that
significantly impairs the training of image classifiers. Our findings also
suggest that scaling synthetic data can be particularly effective in scenarios
such as: (1) when there is a limited supply of real images for a supervised
problem (e.g., fewer than 0.5 million images in ImageNet), (2) when the
evaluation dataset diverges significantly from the training data, indicating
the out-of-distribution scenario, or (3) when synthetic data is used in
conjunction with real images, as demonstrated in the training of CLIP models.
Related papers
- DataDream: Few-shot Guided Dataset Generation [90.09164461462365]
We propose a framework for synthesizing classification datasets that more faithfully represents the real data distribution.
DataDream fine-tunes LoRA weights for the image generation model on the few real images before generating the training data using the adapted model.
We then fine-tune LoRA weights for CLIP using the synthetic data to improve downstream image classification over previous approaches on a large variety of datasets.
arXiv Detail & Related papers (2024-07-15T17:10:31Z) - Reinforcing Pre-trained Models Using Counterfactual Images [54.26310919385808]
This paper proposes a novel framework to reinforce classification models using language-guided generated counterfactual images.
We identify model weaknesses by testing the model using the counterfactual image dataset.
We employ the counterfactual images as an augmented dataset to fine-tune and reinforce the classification model.
arXiv Detail & Related papers (2024-06-19T08:07:14Z) - Is Synthetic Image Useful for Transfer Learning? An Investigation into Data Generation, Volume, and Utilization [62.157627519792946]
We introduce a novel framework called bridged transfer, which initially employs synthetic images for fine-tuning a pre-trained model to improve its transferability.
We propose dataset style inversion strategy to improve the stylistic alignment between synthetic and real images.
Our proposed methods are evaluated across 10 different datasets and 5 distinct models, demonstrating consistent improvements.
arXiv Detail & Related papers (2024-03-28T22:25:05Z) - Image Captions are Natural Prompts for Text-to-Image Models [70.30915140413383]
We analyze the relationship between the training effect of synthetic data and the synthetic data distribution induced by prompts.
We propose a simple yet effective method that prompts text-to-image generative models to synthesize more informative and diverse training data.
Our method significantly improves the performance of models trained on synthetic training data.
arXiv Detail & Related papers (2023-07-17T14:38:11Z) - Evaluating Data Attribution for Text-to-Image Models [62.844382063780365]
We evaluate attribution through "customization" methods, which tune an existing large-scale model toward a given exemplar object or style.
Our key insight is that this allows us to efficiently create synthetic images that are computationally influenced by the exemplar by construction.
By taking into account the inherent uncertainty of the problem, we can assign soft attribution scores over a set of training images.
arXiv Detail & Related papers (2023-06-15T17:59:51Z) - Synthetic Image Data for Deep Learning [0.294944680995069]
Realistic synthetic image data rendered from 3D models can be used to augment image sets and train image classification semantic segmentation models.
We show how high quality physically-based rendering and domain randomization can efficiently create a large synthetic dataset based on production 3D CAD models of a real vehicle.
arXiv Detail & Related papers (2022-12-12T20:28:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.