PromptMix: Text-to-image diffusion models enhance the performance of
lightweight networks
- URL: http://arxiv.org/abs/2301.12914v2
- Date: Tue, 31 Jan 2023 12:33:01 GMT
- Authors: Arian Bakhtiarnia, Qi Zhang, and Alexandros Iosifidis
- Abstract summary: Many deep learning tasks require annotations that are too time-consuming for human operators to produce.
In this paper, we introduce PromptMix, a method for artificially boosting the size of existing datasets.
We show that PromptMix can significantly increase the performance of lightweight networks by up to 26%.
- Score: 83.08625720856445
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many deep learning tasks require annotations that are too
time-consuming for human operators, resulting in small dataset sizes. This is
especially true for dense regression problems such as crowd counting, which
requires the location of every person in the image to be annotated. Techniques
such as data augmentation and synthetic data generation based on simulations
can help in such cases. In this paper, we introduce PromptMix, a method for
artificially boosting the size of existing datasets that can be used to improve
the performance of
lightweight networks. First, synthetic images are generated in an end-to-end
data-driven manner, where text prompts are extracted from existing datasets via
an image captioning deep network, and subsequently introduced to text-to-image
diffusion models. The generated images are then annotated using one or more
high-performing deep networks, and mixed with the real dataset for training the
lightweight network. Through extensive experiments on five datasets and two tasks,
we show that PromptMix can significantly increase the performance of
lightweight networks by up to 26%.
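The four-stage pipeline described in the abstract (caption extraction, diffusion-based generation, pseudo-labeling, and dataset mixing) can be sketched as follows. This is a minimal illustration, not the authors' implementation: `caption_model`, `diffusion_model`, and `teacher_model` are hypothetical callables standing in for the actual image-captioning network, text-to-image diffusion model, and high-performing annotation network.

```python
import random

def promptmix(real_dataset, caption_model, diffusion_model, teacher_model,
              n_synthetic=100):
    """Sketch of the PromptMix augmentation pipeline.

    real_dataset: list of (image, annotation) pairs.
    caption_model: image -> text prompt (hypothetical captioning network).
    diffusion_model: text prompt -> synthetic image (hypothetical).
    teacher_model: image -> annotation (hypothetical teacher network).
    """
    # Step 1: extract text prompts from the existing dataset via captioning.
    prompts = [caption_model(image) for image, _ in real_dataset]

    # Step 2: feed the extracted prompts to a text-to-image diffusion model
    # to generate synthetic images in an end-to-end data-driven manner.
    synthetic_images = [diffusion_model(random.choice(prompts))
                        for _ in range(n_synthetic)]

    # Step 3: annotate the generated images with a high-performing network.
    synthetic_pairs = [(img, teacher_model(img)) for img in synthetic_images]

    # Step 4: mix synthetic and real data for training the lightweight network.
    mixed = list(real_dataset) + synthetic_pairs
    random.shuffle(mixed)
    return mixed
```

The lightweight network would then be trained on the returned mixed set; the stubs above only capture the data flow between the stages.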
Related papers
- Is Synthetic Image Useful for Transfer Learning? An Investigation into Data Generation, Volume, and Utilization [62.157627519792946]
We introduce a novel framework called bridged transfer, which initially employs synthetic images for fine-tuning a pre-trained model to improve its transferability.
We propose dataset style inversion strategy to improve the stylistic alignment between synthetic and real images.
Our proposed methods are evaluated across 10 different datasets and 5 distinct models, demonstrating consistent improvements.
arXiv Detail & Related papers (2024-03-28T22:25:05Z)
- Exposure Bracketing is All You Need for Unifying Image Restoration and Enhancement Tasks [50.822601495422916]
We propose to utilize exposure bracketing photography to unify image restoration and enhancement tasks.
Due to the difficulty in collecting real-world pairs, we suggest a solution that first pre-trains the model with synthetic paired data.
In particular, a temporally modulated recurrent network (TMRNet) and self-supervised adaptation method are proposed.
arXiv Detail & Related papers (2024-01-01T14:14:35Z)
- Semantic Generative Augmentations for Few-Shot Counting [0.0]
We investigate how synthetic data can benefit few-shot class-agnostic counting.
We propose to rely on a double conditioning of Stable Diffusion with both a prompt and a density map.
Our experiments show that our diversified generation strategy significantly improves the counting accuracy of two recent, high-performing few-shot counting models.
arXiv Detail & Related papers (2023-10-26T11:42:48Z)
- DatasetDM: Synthesizing Data with Perception Annotations Using Diffusion Models [61.906934570771256]
We present a generic dataset generation model that can produce diverse synthetic images and perception annotations.
Our method builds upon the pre-trained diffusion model and extends text-guided image synthesis to perception data generation.
We show that the rich latent code of the diffusion model can be effectively decoded as accurate perception annotations using a decoder module.
arXiv Detail & Related papers (2023-08-11T14:38:11Z)
- RADiff: Controllable Diffusion Models for Radio Astronomical Maps Generation [6.128112213696457]
RADiff is a generative approach based on conditional diffusion models trained over an annotated radio dataset.
We show that it is possible to generate fully-synthetic image-annotation pairs to automatically augment any annotated dataset.
arXiv Detail & Related papers (2023-07-05T16:04:44Z)
- Lafite2: Few-shot Text-to-Image Generation [132.14211027057766]
We propose a novel method for pre-training text-to-image generation model on image-only datasets.
It considers a retrieval-then-optimization procedure to synthesize pseudo text features.
It benefits a wide range of settings, including few-shot, semi-supervised, and fully-supervised learning.
arXiv Detail & Related papers (2022-10-25T16:22:23Z)
- Leveraging Image Complexity in Macro-Level Neural Network Design for Medical Image Segmentation [3.974175960216864]
We show that image complexity can be used as a guideline in choosing what is best for a given dataset.
For high-complexity datasets, a shallow network running on the original images may yield better segmentation results than a deep network running on downsampled images.
arXiv Detail & Related papers (2021-12-21T09:49:47Z)
- Using GANs to Augment Data for Cloud Image Segmentation Task [2.294014185517203]
We show the effectiveness of using Generative Adversarial Networks (GANs) to generate data to augment the training set.
We also present a way to estimate ground-truth binary maps for the GAN-generated images to facilitate their effective use as augmented images.
arXiv Detail & Related papers (2021-06-06T09:01:43Z)
- Two-shot Spatially-varying BRDF and Shape Estimation [89.29020624201708]
We propose a novel deep learning architecture with a stage-wise estimation of shape and SVBRDF.
We create a large-scale synthetic training dataset with domain-randomized geometry and realistic materials.
Experiments on both synthetic and real-world datasets show that our network trained on a synthetic dataset can generalize well to real-world images.
arXiv Detail & Related papers (2020-04-01T12:56:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.