Peer is Your Pillar: A Data-unbalanced Conditional GANs for Few-shot Image Generation
- URL: http://arxiv.org/abs/2311.08217v1
- Date: Tue, 14 Nov 2023 14:55:42 GMT
- Title: Peer is Your Pillar: A Data-unbalanced Conditional GANs for Few-shot Image Generation
- Authors: Ziqiang Li, Chaoyue Wang, Xue Rui, Chao Xue, Jiaxu Leng, and Bin Li
- Abstract summary: Few-shot image generation aims to train generative models using a small number of training images.
We propose a novel pipeline called Peer is your Pillar (PIP), which combines a target few-shot dataset with a peer dataset to create a data-unbalanced conditional generation task.
- Score: 24.698516678703236
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Few-shot image generation aims to train generative models using a small
number of training images. When there are few images available for training
(e.g., 10 images), Learning From Scratch (LFS) methods often generate images
that closely resemble the training data, while Transfer Learning (TL) methods
try to improve performance by leveraging prior knowledge from GANs pre-trained
on large-scale datasets. However, current TL methods may not allow for
sufficient control over the degree of knowledge preservation from the source
model, making them unsuitable for setups where the source and target domains
are not closely related. To address this, we propose a novel pipeline called
Peer is your Pillar (PIP), which combines a target few-shot dataset with a peer
dataset to create a data-unbalanced conditional generation task. Our approach
includes a class embedding method that separates the class space from the
latent space, and we use a direction loss based on pre-trained CLIP to improve
image diversity. Experiments on various few-shot datasets demonstrate the
advantages of the proposed PIP, which notably reduces the training requirements
of few-shot image generation.
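For concreteness, here is a minimal PyTorch sketch of a CLIP-based direction loss of the kind the abstract describes: the direction travelled in CLIP image space between a reference and a generated image is pushed to align with a direction in CLIP text space. The embedding dimension, weighting, and interface are illustrative assumptions, not the paper's exact formulation; in practice the embeddings would come from a frozen pre-trained CLIP encoder.

```python
import torch
import torch.nn.functional as F

def clip_direction_loss(img_emb_fake: torch.Tensor,
                        img_emb_real: torch.Tensor,
                        txt_emb_target: torch.Tensor,
                        txt_emb_source: torch.Tensor) -> torch.Tensor:
    """Encourage the image-space edit direction to align with a
    text-space direction (1 - cosine similarity, batch-averaged).

    All inputs are (batch, dim) CLIP embeddings; the text embeddings
    may be broadcast from (1, dim).
    """
    # Direction travelled in CLIP image space by the generator.
    img_dir = F.normalize(img_emb_fake - img_emb_real, dim=-1)
    # Reference direction in CLIP text space.
    txt_dir = F.normalize(txt_emb_target - txt_emb_source, dim=-1)
    return (1.0 - (img_dir * txt_dir).sum(dim=-1)).mean()

# Toy usage with random embeddings standing in for CLIP outputs.
d = 512
loss = clip_direction_loss(torch.randn(4, d), torch.randn(4, d),
                           torch.randn(1, d), torch.randn(1, d))
```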
Related papers
- CALLIC: Content Adaptive Learning for Lossless Image Compression [64.47244912937204]
CALLIC sets a new state-of-the-art (SOTA) for learned lossless image compression.
We propose a content-aware autoregressive self-attention mechanism by leveraging convolutional gating operations.
During encoding, we decompose pre-trained layers, including depth-wise convolutions, using low-rank matrices, and then adapt the incremental weights to the test image via Rate-guided Progressive Fine-Tuning (RPFT).
RPFT fine-tunes on a gradually increasing set of patches, sorted in descending order of estimated entropy, optimizing the learning process and reducing adaptation time.
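A minimal sketch of the low-rank adaptation idea, assuming a frozen depth-wise convolution whose weight receives a trainable low-rank increment; the rank, shapes, and schedule are illustrative, not CALLIC's exact design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LowRankDepthwiseConv(nn.Module):
    """Depth-wise conv whose frozen pre-trained weight gets a trainable
    low-rank update; only the incremental weights are fine-tuned."""
    def __init__(self, base: nn.Conv2d, rank: int = 4):
        super().__init__()
        assert base.groups == base.in_channels  # depth-wise only
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)             # freeze pre-trained weights
        c, _, kh, kw = base.weight.shape
        self.u = nn.Parameter(torch.zeros(c, rank))
        self.v = nn.Parameter(torch.randn(rank, kh * kw) * 1e-3)
        self.kh, self.kw = kh, kw

    def forward(self, x):
        delta = (self.u @ self.v).view(-1, 1, self.kh, self.kw)
        w = self.base.weight + delta            # W0 + low-rank increment
        return F.conv2d(x, w, self.base.bias,
                        stride=self.base.stride, padding=self.base.padding,
                        dilation=self.base.dilation, groups=self.base.groups)

layer = LowRankDepthwiseConv(nn.Conv2d(32, 32, 3, padding=1, groups=32))
y = layer(torch.randn(1, 32, 16, 16))
# RPFT would then fine-tune u, v on test-image patches scheduled in
# descending order of estimated entropy.
```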
arXiv Detail & Related papers (2024-12-23T10:41:18Z)
- No Data Augmentation? Alternative Regularizations for Effective Training on Small Datasets [0.0]
We study alternative regularization strategies to push the limits of supervised learning on small image classification datasets.
In particular, we employ a model-agnostic heuristic to select (semi-)optimal learning rate and weight decay pairs via the norm of the model parameters.
We reach a test accuracy of 66.5%, on par with the best state-of-the-art methods.
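As a rough illustration of selecting hyperparameter pairs via the parameter norm, here is a hypothetical probe: run a short trial for each (learning rate, weight decay) pair and record the resulting norm. The selection criterion, grid, and optimizer settings are assumptions, not the paper's procedure.

```python
import itertools
import torch
import torch.nn as nn

def trial_run(make_model, lr, wd, loader, steps=50):
    """Short training probe; returns the final L2 norm of all parameters."""
    model = make_model()
    opt = torch.optim.SGD(model.parameters(), lr=lr,
                          weight_decay=wd, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()
    data = itertools.cycle(loader)
    for _ in range(steps):
        x, y = next(data)
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()
    return torch.sqrt(sum((p ** 2).sum() for p in model.parameters())).item()

# Toy usage: rank (lr, wd) pairs on a synthetic 10-class problem; one
# would then keep pairs whose norm lands in a target band (neither
# exploding nor collapsing toward zero).
loader = [(torch.randn(8, 20), torch.randint(0, 10, (8,))) for _ in range(4)]
make_model = lambda: nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 10))
norms = {(lr, wd): trial_run(make_model, lr, wd, loader)
         for lr in (1e-2, 1e-1) for wd in (1e-4, 1e-2)}
```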
arXiv Detail & Related papers (2023-09-04T16:13:59Z)
- Training on Thin Air: Improve Image Classification with Generated Data [28.96941414724037]
Diffusion Inversion is a simple yet effective method to generate diverse, high-quality training data for image classification.
Our approach captures the original data distribution and ensures data coverage by inverting images to the latent space of Stable Diffusion.
We identify three key components that allow our generated images to successfully supplant the original dataset.
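A schematic of the invert-then-perturb recipe, with placeholder functions standing in for the actual Stable Diffusion inversion and decoding steps (which this sketch does not implement); shapes and the perturbation scale are illustrative.

```python
import torch

LATENT_SHAPE = (4, 32, 32)  # assumed latent geometry, for illustration

def invert_to_latent(image: torch.Tensor) -> torch.Tensor:
    """Placeholder for DDIM-style inversion into the diffusion latent
    space; here it only maps the image to a deterministic dummy latent."""
    seed = int(image.sum().abs().item() * 1e3) % (2 ** 31)
    g = torch.Generator().manual_seed(seed)
    return torch.randn(LATENT_SHAPE, generator=g)

def decode_from_latent(latent: torch.Tensor) -> torch.Tensor:
    """Placeholder for the diffusion decoder; returns a dummy image."""
    return torch.tanh(latent.mean(dim=0, keepdim=True).repeat(3, 1, 1))

def synthesize_variants(image: torch.Tensor, n: int = 8, scale: float = 0.1):
    """Invert once, then decode perturbed latents: the anchor latent
    preserves data coverage while the perturbations supply diversity."""
    z = invert_to_latent(image)
    return [decode_from_latent(z + scale * torch.randn_like(z)) for _ in range(n)]

variants = synthesize_variants(torch.rand(3, 256, 256))
```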
arXiv Detail & Related papers (2023-05-24T16:33:02Z)
- Multimodal Data Augmentation for Image Captioning using Diffusion Models [12.221685807426264]
We propose a data augmentation method, leveraging a text-to-image model called Stable Diffusion, to expand the training set.
Experiments on the MS COCO dataset demonstrate the advantages of our approach over several benchmark methods.
Further improvement regarding the training efficiency and effectiveness can be obtained after intentionally filtering the generated data.
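One common way to filter generated image-caption pairs is by embedding similarity; the sketch below assumes CLIP-style embeddings and a fixed threshold, which may differ from the paper's actual filtering criterion.

```python
import torch
import torch.nn.functional as F

def filter_pairs(img_embs: torch.Tensor, txt_embs: torch.Tensor,
                 threshold: float = 0.25):
    """Keep synthetic image-caption pairs whose cross-modal embedding
    similarity exceeds a threshold; a simple post-hoc filtering recipe."""
    sims = F.cosine_similarity(img_embs, txt_embs, dim=-1)
    keep = sims >= threshold
    return keep, sims

# Toy usage with random embeddings standing in for a CLIP-style encoder.
keep, sims = filter_pairs(torch.randn(16, 512), torch.randn(16, 512))
```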
arXiv Detail & Related papers (2023-05-03T01:57:33Z)
- DINOv2: Learning Robust Visual Features without Supervision [75.42921276202522]
This work shows that existing pretraining methods, especially self-supervised methods, can produce such features if trained on enough curated data from diverse sources.
Most of the technical contributions aim at accelerating and stabilizing the training at scale.
In terms of data, we propose an automatic pipeline to build a dedicated, diverse, and curated image dataset instead of uncurated data, as typically done in the self-supervised literature.
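As one illustrative ingredient of such a curation pipeline, the sketch below performs greedy near-duplicate removal on image embeddings; the threshold and embedding dimension are assumptions, and DINOv2's actual pipeline is considerably more elaborate.

```python
import torch
import torch.nn.functional as F

def deduplicate(embs: torch.Tensor, threshold: float = 0.95) -> torch.Tensor:
    """Greedy near-duplicate removal on normalized image embeddings:
    keep an image only if no already-kept image is too similar."""
    embs = F.normalize(embs, dim=-1)
    kept: list[int] = []
    for i in range(embs.size(0)):
        if not kept or (embs[kept] @ embs[i]).max() < threshold:
            kept.append(i)
    return torch.tensor(kept)

keep_idx = deduplicate(torch.randn(100, 384))  # 384-dim, e.g. a ViT-S embedding
```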
arXiv Detail & Related papers (2023-04-14T15:12:19Z)
- Noise Self-Regression: A New Learning Paradigm to Enhance Low-Light Images Without Task-Related Data [86.68013790656762]
We propose Noise SElf-Regression (NoiSER), which enhances low-light images without access to any task-related data.
NoiSER is highly competitive in enhancement quality, yet with a much smaller model size, and much lower training and inference cost.
arXiv Detail & Related papers (2022-11-09T06:18:18Z)
- Generative Negative Text Replay for Continual Vision-Language Pretraining [95.2784858069843]
Vision-language pre-training has attracted increasing attention recently.
Massive amounts of data are usually collected in a streaming fashion.
We propose a multi-modal knowledge distillation between images and texts to align the instance-wise prediction between old and new models.
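A minimal sketch of instance-wise multi-modal distillation, assuming both the old (frozen) and new models expose CLIP-style image and text embeddings; the temperature and KL form are illustrative choices, not necessarily the paper's.

```python
import torch
import torch.nn.functional as F

def distill_alignment_loss(img_new, txt_new, img_old, txt_old, tau=0.07):
    """Align the new model's instance-wise image-text similarity
    distribution with the frozen old model's via a KL term."""
    logits_new = (F.normalize(img_new, dim=-1)
                  @ F.normalize(txt_new, dim=-1).T) / tau
    with torch.no_grad():
        logits_old = (F.normalize(img_old, dim=-1)
                      @ F.normalize(txt_old, dim=-1).T) / tau
    return F.kl_div(F.log_softmax(logits_new, dim=-1),
                    F.softmax(logits_old, dim=-1),
                    reduction="batchmean")

# Toy usage with random embeddings standing in for the two models.
loss = distill_alignment_loss(*(torch.randn(8, 256) for _ in range(4)))
```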
arXiv Detail & Related papers (2022-10-31T13:42:21Z)
- Feature transforms for image data augmentation [74.12025519234153]
In image classification, many augmentation approaches utilize simple image manipulation algorithms.
In this work, we build ensembles on the data level by adding images generated by combining fourteen augmentation approaches.
Pretrained ResNet50 networks are finetuned on training sets that include images derived from each augmentation method.
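A minimal sketch of the data-level ensembling recipe: fine-tune one pre-trained ResNet50 per augmentation-expanded training set, then average the members' softmax outputs. The torchvision weight tag and class count are assumptions.

```python
import torch
import torchvision

def make_member(num_classes: int) -> torch.nn.Module:
    """One ensemble member: a pre-trained ResNet50 with a fresh head, to
    be fine-tuned on a training set expanded by one augmentation method.
    (Downloads ImageNet weights on first use.)"""
    model = torchvision.models.resnet50(weights="IMAGENET1K_V2")
    model.fc = torch.nn.Linear(model.fc.in_features, num_classes)
    return model

@torch.no_grad()
def ensemble_predict(models, x):
    """Average the softmax outputs of members trained with different
    augmentations (fourteen in the paper)."""
    probs = torch.stack([m(x).softmax(dim=-1) for m in models])
    return probs.mean(dim=0)

# members = [make_member(num_classes=10) for _ in range(14)]  # one per augmentation
# probs = ensemble_predict(members, torch.randn(4, 3, 224, 224))
```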
arXiv Detail & Related papers (2022-01-24T14:12:29Z)
- Meta Internal Learning [88.68276505511922]
Internal learning for single-image generation is a framework in which a generator is trained to produce novel images based on a single image.
We propose a meta-learning approach that enables training over a collection of images, in order to model the internal statistics of the sample image more effectively.
Our results show that the models obtained are as suitable as single-image GANs for many common image applications.
arXiv Detail & Related papers (2021-10-06T16:27:38Z)
- Self-Supervised Generative Style Transfer for One-Shot Medical Image Segmentation [10.634870214944055]
In medical image segmentation, supervised deep networks' success comes at the cost of requiring abundant labeled data.
We propose a novel volumetric self-supervised learning method for data augmentation, capable of synthesizing volumetric image-segmentation pairs.
Our work's central tenet is a combined view of one-shot generative learning and the proposed self-supervised training strategy.
arXiv Detail & Related papers (2021-10-05T15:28:42Z)
- Multiclass non-Adversarial Image Synthesis, with Application to Classification from Very Small Sample [6.243995448840211]
We present a novel non-adversarial generative method: Clustered Optimization of LAtent space (COLA).
In the full data regime, our method is capable of generating diverse multi-class images with no supervision.
In the small-data regime, where only a small sample of labeled images is available for training with no access to additional unlabeled data, our results surpass state-of-the-art GAN models trained on the same amount of data.
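A minimal GLO-style sketch of non-adversarial latent optimization with per-class structure; the architecture, dimensions, and reconstruction loss are illustrative stand-ins for COLA's actual clustered optimization.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class LatentTable(nn.Module):
    """One learnable latent per training image (GLO-style); COLA
    additionally organizes the latent space by class, hinted at here
    with a learnable per-class center added to each latent."""
    def __init__(self, n_images: int, n_classes: int, dim: int = 64):
        super().__init__()
        self.z = nn.Parameter(0.01 * torch.randn(n_images, dim))
        self.centers = nn.Parameter(torch.randn(n_classes, dim))

    def forward(self, idx, labels):
        return self.z[idx] + self.centers[labels]

# Generator and latents are optimized jointly with a plain
# reconstruction loss, i.e. no discriminator is involved.
G = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 3 * 8 * 8))
table = LatentTable(n_images=100, n_classes=10)
opt = torch.optim.Adam(list(G.parameters()) + list(table.parameters()), lr=1e-3)

idx = torch.arange(16)
labels = torch.randint(0, 10, (16,))
targets = torch.rand(16, 3 * 8 * 8)            # stand-in for real images
loss = F.mse_loss(G(table(idx, labels)), targets)
opt.zero_grad(); loss.backward(); opt.step()
```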
arXiv Detail & Related papers (2020-11-25T18:47:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.