Training on Thin Air: Improve Image Classification with Generated Data
- URL: http://arxiv.org/abs/2305.15316v1
- Date: Wed, 24 May 2023 16:33:02 GMT
- Title: Training on Thin Air: Improve Image Classification with Generated Data
- Authors: Yongchao Zhou, Hshmat Sahak, Jimmy Ba
- Abstract summary: Diffusion Inversion is a simple yet effective method to generate diverse, high-quality training data for image classification.
Our approach captures the original data distribution and ensures data coverage by inverting images to the latent space of Stable Diffusion.
We identify three key components that allow our generated images to successfully supplant the original dataset.
- Score: 28.96941414724037
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Acquiring high-quality data for training discriminative models is a crucial
yet challenging aspect of building effective predictive systems. In this paper,
we present Diffusion Inversion, a simple yet effective method that leverages
the pre-trained generative model, Stable Diffusion, to generate diverse,
high-quality training data for image classification. Our approach captures the
original data distribution and ensures data coverage by inverting images to the
latent space of Stable Diffusion, and generates diverse novel training images
by conditioning the generative model on noisy versions of these vectors. We
identify three key components that allow our generated images to successfully
supplant the original dataset, leading to a 2-3x enhancement in sample
complexity and a 6.5x decrease in sampling time. Moreover, our approach
consistently outperforms generic prompt-based steering methods and the KNN
retrieval baseline across a wide range of datasets. Additionally, we
demonstrate the compatibility of our approach with widely-used data
augmentation techniques, as well as the reliability of the generated data in
supporting various neural architectures and enhancing few-shot learning.
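The abstract's perturb-and-regenerate loop can be pictured with a short sketch. The following is a minimal illustration using the diffusers library, not the authors' code: in the paper, the inverted conditioning vector is obtained by optimizing it until Stable Diffusion reconstructs a specific training image, a step that is only stubbed out here with an ordinary text embedding; the checkpoint name and the noise scale sigma are likewise assumptions.

```python
# Minimal perturb-and-regenerate sketch (not the authors' code). In the paper,
# `inverted_embeds` comes from optimizing a conditioning vector until Stable
# Diffusion reconstructs a specific training image; here an ordinary text
# embedding stands in for that inversion step.
import torch
from diffusers import StableDiffusionPipeline

device = "cuda" if torch.cuda.is_available() else "cpu"
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # assumed checkpoint
    torch_dtype=torch.float16 if device == "cuda" else torch.float32,
).to(device)

tokens = pipe.tokenizer(
    "a photo of a golden retriever",  # hypothetical stand-in description
    padding="max_length",
    max_length=pipe.tokenizer.model_max_length,
    return_tensors="pt",
)
with torch.no_grad():
    inverted_embeds = pipe.text_encoder(tokens.input_ids.to(device))[0]

sigma = 0.1  # assumed perturbation scale
variants = []
for _ in range(4):
    # Condition the generator on a noisy version of the inverted vector.
    noisy = inverted_embeds + sigma * torch.randn_like(inverted_embeds)
    variants.append(pipe(prompt_embeds=noisy, num_inference_steps=30).images[0])
```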
Related papers
- A Simple Background Augmentation Method for Object Detection with Diffusion Model [53.32935683257045]
In computer vision, it is well-known that a lack of data diversity will impair model performance.
We propose a simple yet effective data augmentation approach by leveraging advancements in generative models.
Background augmentation, in particular, significantly improves the models' robustness and generalization capabilities.
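One plausible way to realize this kind of background augmentation is to repaint everything outside the annotated boxes with an off-the-shelf inpainting model. The sketch below assumes the diffusers inpainting pipeline and a box-based mask convention; it illustrates the idea, not the paper's exact pipeline.

```python
# Sketch of background replacement for detection data (one plausible reading
# of the paper; the mask convention and checkpoint are assumptions).
from PIL import Image, ImageDraw
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting"  # assumed checkpoint
).to("cuda")

def augment_background(image: Image.Image, boxes, prompt="a snowy street"):
    """Repaint everything outside the object boxes; boxes are (x0, y0, x1, y1)."""
    # White = regions the model may repaint; objects stay untouched (black).
    mask = Image.new("L", image.size, 255)
    draw = ImageDraw.Draw(mask)
    for box in boxes:
        draw.rectangle(box, fill=0)
    return pipe(prompt=prompt, image=image, mask_image=mask).images[0]
```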
arXiv Detail & Related papers (2024-08-01T07:40:00Z)
- Gradient Inversion of Federated Diffusion Models [4.1355611383748005]
Diffusion models are becoming the de facto generative models, generating exceptionally high-resolution image data.
In this paper, we study the privacy risk of gradient inversion attacks.
We propose GIDM+, a triple-optimization attack that coordinates the optimization of the unknown data.
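For context, the gradient-inversion attack family can be sketched as follows: the attacker optimizes dummy inputs so that the gradients they induce match the gradients observed in federated training. This generic PyTorch loop (cosine-similarity matching, in the spirit of earlier gradient-leakage work) is far simpler than GIDM+'s triple optimization.

```python
# Generic gradient-inversion loop (illustrates the attack family the paper
# studies; GIDM+ itself is more elaborate than this sketch).
import torch

def invert_gradients(model, observed_grads, data_shape, steps=1000, lr=0.1):
    """Optimize dummy data so its gradients match gradients observed in FL."""
    dummy = torch.randn(data_shape, requires_grad=True)
    opt = torch.optim.Adam([dummy], lr=lr)
    params = [p for p in model.parameters() if p.requires_grad]
    for _ in range(steps):
        opt.zero_grad()
        loss = model(dummy)  # assumes the model returns its training loss
        grads = torch.autograd.grad(loss, params, create_graph=True)
        # Negative cosine similarity between simulated and observed gradients.
        match = sum(
            1 - torch.nn.functional.cosine_similarity(g.flatten(), o.flatten(), dim=0)
            for g, o in zip(grads, observed_grads)
        )
        match.backward()
        opt.step()
    return dummy.detach()
```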
arXiv Detail & Related papers (2024-05-30T18:00:03Z)
- YaART: Yet Another ART Rendering Technology [119.09155882164573]
This study introduces YaART, a novel production-grade text-to-image cascaded diffusion model aligned to human preferences.
We analyze how key design choices affect both the efficiency of the training process and the quality of the generated images.
We demonstrate that models trained on smaller datasets of higher-quality images can successfully compete with those trained on larger datasets.
arXiv Detail & Related papers (2024-04-08T16:51:19Z)
- DetDiffusion: Synergizing Generative and Perceptive Models for Enhanced Data Generation and Perception [78.26734070960886]
Current perceptive models heavily depend on resource-intensive datasets.
We introduce a perception-aware loss (P.A. loss) based on segmentation, improving both quality and controllability.
Our method customizes data augmentation by extracting and utilizing a perception-aware attribute (P.A. Attr) during generation.
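A hedged guess at what a segmentation-driven perception-aware term could look like in code: decode the model's current latent estimate and penalize disagreement with the ground-truth mask under a frozen segmenter. Every name below (`vae`, `segmenter`, the weighting) is a placeholder, not DetDiffusion's API.

```python
# Placeholder sketch of a segmentation-driven perception-aware loss term.
import torch.nn.functional as F

def perception_aware_loss(pred_latents, target_mask, vae, segmenter):
    """Segmentation consistency on the model's current reconstruction."""
    image = vae.decode(pred_latents / 0.18215).sample  # SD v1 latent scaling
    logits = segmenter(image)                          # frozen, (B, C, H, W)
    return F.cross_entropy(logits, target_mask)        # mask: (B, H, W) ints

# total_loss = denoising_mse + lambda_pa * perception_aware_loss(...)
```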
arXiv Detail & Related papers (2024-03-20T04:58:03Z)
- One Category One Prompt: Dataset Distillation using Diffusion Models [22.512552596310176]
We introduce Dataset Distillation using Diffusion Models (D3M), a novel paradigm that leverages recent advancements in generative text-to-image foundation models.
Our approach utilizes textual inversion, a technique for fine-tuning text-to-image generative models, to create concise and informative representations for large datasets.
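A compact textual-inversion recipe, which is the generic technique the summary refers to (one learned token per category), is sketched below. It assumes `pipe` is a Stable Diffusion v1 pipeline and `class_images` is a (B, 3, 512, 512) tensor in [-1, 1] on the pipeline's device; hyperparameters are illustrative, and this is not the authors' code.

```python
# Learn one new token embedding per category by the standard textual-inversion
# objective: denoising MSE with all other weights frozen.
import torch
import torch.nn.functional as F
from diffusers import DDPMScheduler

pipe.unet.requires_grad_(False)
pipe.vae.requires_grad_(False)

placeholder = "<class-0>"  # hypothetical new token
pipe.tokenizer.add_tokens(placeholder)
pipe.text_encoder.resize_token_embeddings(len(pipe.tokenizer))
token_id = pipe.tokenizer.convert_tokens_to_ids(placeholder)

embeddings = pipe.text_encoder.get_input_embeddings().weight
orig = embeddings.detach().clone()
opt = torch.optim.AdamW([embeddings], lr=5e-4)
train_sched = DDPMScheduler.from_config(pipe.scheduler.config)

ids = pipe.tokenizer(f"a photo of {placeholder}", padding="max_length",
                     max_length=pipe.tokenizer.model_max_length,
                     return_tensors="pt").input_ids.to(pipe.device)

for step in range(500):
    with torch.no_grad():
        latents = pipe.vae.encode(class_images).latent_dist.sample() * 0.18215
    noise = torch.randn_like(latents)
    t = torch.randint(0, train_sched.config.num_train_timesteps,
                      (latents.shape[0],), device=latents.device)
    noisy = train_sched.add_noise(latents, noise, t)
    cond = pipe.text_encoder(ids.expand(latents.shape[0], -1))[0]
    pred = pipe.unet(noisy, t, encoder_hidden_states=cond).sample
    loss = F.mse_loss(pred, noise)
    opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():  # restore every embedding row except the new token's
        keep = torch.arange(len(embeddings)) != token_id
        embeddings.data[keep] = orig[keep]
```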
arXiv Detail & Related papers (2024-03-11T20:23:59Z)
- Distribution-Aware Data Expansion with Diffusion Models [55.979857976023695]
We propose DistDiff, a training-free data expansion framework based on a distribution-aware diffusion model.
DistDiff consistently enhances accuracy across a diverse range of datasets compared to models trained solely on original data.
arXiv Detail & Related papers (2024-03-11T14:07:53Z)
- Learning-Based Biharmonic Augmentation for Point Cloud Classification [79.13962913099378]
Biharmonic Augmentation (BA) is a novel and efficient data augmentation technique.
BA diversifies point cloud data by imposing smooth non-rigid deformations on existing 3D structures.
We present AdvTune, an advanced online augmentation system that integrates adversarial training.
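As a rough stand-in for biharmonic deformation (which solves for smooth displacement fields on the shape itself), a sum of Gaussian bumps anchored at random handle points already yields smooth non-rigid warps; all parameters below are illustrative.

```python
# Simplified smooth non-rigid deformation for point clouds (a stand-in for
# true biharmonic deformation, sketched for illustration only).
import numpy as np

def smooth_deform(points, n_handles=4, scale=0.05, width=0.5, rng=None):
    """points: (N, 3) array; returns a smoothly deformed copy."""
    rng = rng if rng is not None else np.random.default_rng()
    handles = points[rng.choice(len(points), n_handles, replace=False)]
    offsets = rng.normal(0.0, scale, size=(n_handles, 3))  # per-handle shift
    d2 = ((points[:, None, :] - handles[None, :, :]) ** 2).sum(-1)  # (N, H)
    w = np.exp(-d2 / (2 * width**2))  # nearby handles dominate
    return points + w @ offsets

cloud = np.random.rand(1024, 3).astype(np.float32)  # toy point cloud
augmented = smooth_deform(cloud)
```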
arXiv Detail & Related papers (2023-11-10T14:04:49Z)
- DiffuGen: Adaptable Approach for Generating Labeled Image Datasets using Stable Diffusion Models [2.0935496890864207]
"DiffuGen" is a simple and adaptable approach that harnesses the power of stable diffusion models to create labeled image datasets efficiently.
By leveraging stable diffusion models, our approach not only ensures the quality of generated datasets but also provides a versatile solution for label generation.
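The basic recipe can be sketched as a loop that renders images from per-class prompts and encodes the label in the filename; class names, checkpoint, and counts below are assumptions, and DiffuGen's actual labeling modes for detection and segmentation are richer than this.

```python
# Prompt-driven labeled image generation (the generic recipe, not DiffuGen's code).
import os
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

classes = ["cat", "dog", "horse"]  # assumed label set
os.makedirs("synthetic", exist_ok=True)
for label, name in enumerate(classes):
    for i in range(50):  # images per class
        img = pipe(prompt=f"a photo of a {name}", num_inference_steps=25).images[0]
        img.save(f"synthetic/{label:03d}_{name}_{i:04d}.png")  # filename encodes label
```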
arXiv Detail & Related papers (2023-09-01T04:42:03Z)
- Multimodal Data Augmentation for Image Captioning using Diffusion Models [12.221685807426264]
We propose a data augmentation method, leveraging a text-to-image model called Stable Diffusion, to expand the training set.
Experiments on the MS COCO dataset demonstrate the advantages of our approach over several benchmark methods.
Training efficiency and effectiveness can be further improved by deliberately filtering the generated data.
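One way to realize such filtering, sketched here with assumed models and an assumed threshold, is to keep only synthetic (image, caption) pairs whose CLIP image-text similarity clears a cutoff:

```python
# Synthesize an image per caption, keep pairs that pass a CLIP-similarity
# filter (one plausible reading of the paper's filtering step).
import torch
from diffusers import StableDiffusionPipeline
from transformers import CLIPModel, CLIPProcessor

sd = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").to("cuda")
proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def synth_pairs(captions, threshold=0.28):
    kept = []
    for cap in captions:
        img = sd(prompt=cap, num_inference_steps=25).images[0]
        inputs = proc(text=[cap], images=img, return_tensors="pt", padding=True).to("cuda")
        with torch.no_grad():
            out = clip(**inputs)
        sim = torch.nn.functional.cosine_similarity(out.image_embeds, out.text_embeds).item()
        if sim >= threshold:
            kept.append((img, cap))
    return kept
```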
arXiv Detail & Related papers (2023-05-03T01:57:33Z)
- LayoutDiffuse: Adapting Foundational Diffusion Models for Layout-to-Image Generation [24.694298869398033]
Our method trains efficiently and generates images with both high perceptual quality and layout alignment.
It significantly outperforms 10 other generative models based on GANs, VQ-VAE, and diffusion models.
arXiv Detail & Related papers (2023-02-16T14:20:25Z)
- Effective Data Augmentation With Diffusion Models [65.09758931804478]
We address the lack of diversity in data augmentation with image-to-image transformations parameterized by pre-trained text-to-image diffusion models.
Our method edits images to change their semantics using an off-the-shelf diffusion model, and generalizes to novel visual concepts from a few labelled examples.
We evaluate our approach on few-shot image classification tasks and on a real-world weed recognition task, observing improved accuracy in the tested domains.
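The core augmentation step can be sketched with the standard img2img pipeline: re-render a labelled image at several edit strengths while keeping the class name in the prompt, so the label plausibly survives the edit. The checkpoint and strength values below are assumptions, not the paper's settings.

```python
# Diffusion-based image editing as augmentation (a sketch in the paper's spirit).
import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

def augment(image: Image.Image, class_name: str, strengths=(0.3, 0.5, 0.7)):
    """Re-render one labelled image at several edit strengths; keeping the
    class name in the prompt helps preserve the label."""
    prompt = f"a photo of a {class_name}"
    return [
        pipe(prompt=prompt, image=image, strength=s, num_inference_steps=30).images[0]
        for s in strengths
    ]
```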
arXiv Detail & Related papers (2023-02-07T20:42:28Z)