StyleAugment: Learning Texture De-biased Representations by Style
Augmentation without Pre-defined Textures
- URL: http://arxiv.org/abs/2108.10549v1
- Date: Tue, 24 Aug 2021 07:17:02 GMT
- Title: StyleAugment: Learning Texture De-biased Representations by Style
Augmentation without Pre-defined Textures
- Authors: Sanghyuk Chun, Song Park
- Abstract summary: Recent powerful vision classifiers are biased towards textures, while shape information is overlooked by the models.
A simple attempt to reduce the texture bias augments training images with an artistic style transfer method, called Stylized ImageNet.
However, the Stylized ImageNet approach has two drawbacks, in fidelity and diversity.
We propose StyleAugment, which augments styles from the mini-batch.
- Score: 7.81768535871051
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent powerful vision classifiers are biased towards textures, while shape
information is overlooked by the models. A simple attempt to reduce the texture
bias augments training images with an artistic style transfer method, called
Stylized ImageNet. However, the Stylized ImageNet approach has two drawbacks, in
fidelity and diversity. First, the generated images show low image quality due
to the significant semantic gap between natural images and artistic paintings.
Second, the Stylized ImageNet training samples are pre-computed before training,
resulting in a lack of diversity for each sample. We propose StyleAugment, which
augments styles from the mini-batch. StyleAugment does not rely on pre-defined
style references, but generates augmented images on-the-fly, using the natural
images in the mini-batch as references. Hence, StyleAugment lets the model
observe abundant confounding cues for each image through the on-the-fly
augmentation strategy, while the augmented images are more realistic than
artistic style-transferred images. We validate the effectiveness of StyleAugment
on the ImageNet dataset with robustness benchmarks, such as texture de-biased
accuracy, corruption robustness, natural adversarial samples, and occlusion
robustness. StyleAugment shows better generalization performance than previous
unsupervised de-biasing methods and state-of-the-art data augmentation methods
in our experiments.
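The abstract describes the core mechanism: instead of pre-computing stylized images against fixed artistic references, each training image borrows its style on the fly from another natural image in the same mini-batch. Below is a minimal sketch of that idea; it assumes an AdaIN-style feature-statistics swap on top of a pre-trained style-transfer encoder/decoder, and the names `encoder`, `decoder`, and `alpha` are illustrative placeholders rather than the paper's actual implementation.

```python
# Hypothetical sketch of mini-batch style augmentation in the spirit of
# StyleAugment: each image borrows style statistics from another natural image
# in the same mini-batch, computed on the fly (no pre-defined style references).
import torch


def adain(content_feat: torch.Tensor, style_feat: torch.Tensor,
          eps: float = 1e-5) -> torch.Tensor:
    """Swap per-channel mean/std of content features for those of style features."""
    c_mean = content_feat.mean(dim=(2, 3), keepdim=True)
    c_std = content_feat.std(dim=(2, 3), keepdim=True) + eps
    s_mean = style_feat.mean(dim=(2, 3), keepdim=True)
    s_std = style_feat.std(dim=(2, 3), keepdim=True) + eps
    return s_std * (content_feat - c_mean) / c_std + s_mean


def minibatch_style_augment(images: torch.Tensor, encoder, decoder,
                            alpha: float = 0.5) -> torch.Tensor:
    """Stylize each image with a randomly chosen image from the same mini-batch.

    `encoder`/`decoder` stand in for a pre-trained style-transfer backbone
    (e.g., a VGG encoder with a matching decoder); `alpha` blends original and
    stylized features so the augmented images stay close to natural images.
    """
    perm = torch.randperm(images.size(0), device=images.device)
    content = encoder(images)      # content features of the whole batch
    style = content[perm]          # style references drawn from the same batch
    mixed = alpha * adain(content, style) + (1.0 - alpha) * content
    return decoder(mixed)          # augmented images, generated on the fly
```

How the augmented images are combined with the original batch and the classification loss is not specified in the abstract, so the sketch above should be read as a conceptual outline rather than the authors' training recipe.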
Related papers
- ZePo: Zero-Shot Portrait Stylization with Faster Sampling [61.14140480095604]
This paper presents an inversion-free portrait stylization framework based on diffusion models that accomplishes content and style feature fusion in merely four sampling steps.
We propose a feature merging strategy to amalgamate redundant features in Consistency Features, thereby reducing the computational load of attention control.
arXiv Detail & Related papers (2024-08-10T08:53:41Z) - MuseumMaker: Continual Style Customization without Catastrophic Forgetting [50.12727620780213]
We propose MuseumMaker, a method that enables the synthesis of images by following a set of customized styles in a never-ending manner.
When facing a new customization style, we develop a style distillation loss module to extract and learn the styles of the training data for new image generation.
It can minimize the learning biases caused by the content of new training images, and address the catastrophic overfitting issue induced by few-shot images.
arXiv Detail & Related papers (2024-04-25T13:51:38Z) - Direct Consistency Optimization for Compositional Text-to-Image
Personalization [73.94505688626651]
Text-to-image (T2I) diffusion models, when fine-tuned on a few personal images, are able to generate visuals with a high degree of consistency.
We propose to fine-tune the T2I model by maximizing consistency to reference images, while penalizing the deviation from the pretrained model.
arXiv Detail & Related papers (2024-02-19T09:52:41Z) - InstaStyle: Inversion Noise of a Stylized Image is Secretly a Style Adviser [19.466860144772674]
In this paper, we propose InstaStyle, a novel approach that excels in generating high-fidelity stylized images with only a single reference image.
Our approach is based on the finding that the inversion noise from a stylized reference image inherently carries the style signal.
We introduce a learnable style token via prompt refinement, which enhances the accuracy of the style description for the reference image.
arXiv Detail & Related papers (2023-11-25T14:38:54Z) - ControlStyle: Text-Driven Stylized Image Generation Using Diffusion
Priors [105.37795139586075]
We propose a new task for "stylizing" text-to-image models, namely text-driven stylized image generation.
We present a new diffusion model (ControlStyle) via upgrading a pre-trained text-to-image model with a trainable modulation network.
Experiments demonstrate the effectiveness of our ControlStyle in producing more visually pleasing and artistic results.
arXiv Detail & Related papers (2023-11-09T15:50:52Z) - WSAM: Visual Explanations from Style Augmentation as Adversarial
Attacker and Their Influence in Image Classification [2.282270386262498]
This paper outlines a style augmentation algorithm that uses noise-based sampling, together with improved randomization of a general linear transformation, for style transfer.
The resulting models not only show strong robustness to image stylization but also outperform previous methods and surpass state-of-the-art performance on the STL-10 dataset.
arXiv Detail & Related papers (2023-08-29T02:50:36Z) - DIFF-NST: Diffusion Interleaving For deFormable Neural Style Transfer [27.39248034592382]
We propose using a new class of models to perform style transfer while enabling deformable style transfer.
We show how leveraging the priors of these models can expose new artistic controls at inference time.
arXiv Detail & Related papers (2023-07-09T12:13:43Z) - Cap2Aug: Caption guided Image to Image data Augmentation [41.53127698828463]
Cap2Aug is an image-to-image diffusion model-based data augmentation strategy using image captions as text prompts.
We generate captions from the limited training images and, using these captions, edit the training images with an image-to-image stable diffusion model.
This strategy generates augmented versions of images similar to the training images yet provides semantic diversity across the samples.
arXiv Detail & Related papers (2022-12-11T04:37:43Z) - Learning Diverse Tone Styles for Image Retouching [73.60013618215328]
We propose to learn diverse image retouching with normalizing flow-based architectures.
A joint-training pipeline is composed of a style encoder, a conditional RetouchNet, and the image tone style normalizing flow (TSFlow) module.
Our proposed method performs favorably against state-of-the-art methods and is effective in generating diverse results.
arXiv Detail & Related papers (2022-07-12T09:49:21Z) - Domain Enhanced Arbitrary Image Style Transfer via Contrastive Learning [84.8813842101747]
Contrastive Arbitrary Style Transfer (CAST) is a new style representation learning and style transfer method via contrastive learning.
Our framework consists of three key components, i.e., a multi-layer style projector for style code encoding, a domain enhancement module for effective learning of style distribution, and a generative network for image style transfer.
arXiv Detail & Related papers (2022-05-19T13:11:24Z) - P$^2$-GAN: Efficient Style Transfer Using Single Style Image [2.703193151632043]
Style transfer is a useful image synthesis technique that can re-render a given image in another artistic style.
The Generative Adversarial Network (GAN) is a widely adopted framework for this task owing to its better ability to represent local style patterns.
In this paper, a novel Patch Permutation GAN (P$^2$-GAN) network that can efficiently learn the stroke style from a single style image is proposed.
arXiv Detail & Related papers (2020-01-21T12:08:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.