PromptStyler: Prompt-driven Style Generation for Source-free Domain
Generalization
- URL: http://arxiv.org/abs/2307.15199v2
- Date: Tue, 15 Aug 2023 08:30:45 GMT
- Authors: Junhyeong Cho, Gilhyun Nam, Sungyeon Kim, Hunmin Yang, Suha Kwak
- Abstract summary: We propose PromptStyler which simulates various distribution shifts in the joint space by synthesizing diverse styles via prompts.
The proposed method learns to generate a variety of style features via learnable style word vectors for pseudo-words S*.
PromptStyler achieves the state of the art on PACS, VLCS, OfficeHome and DomainNet, even though it does not require any images for training.
- Score: 35.37285674554127
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In a joint vision-language space, a text feature (e.g., from "a photo of a
dog") could effectively represent its relevant image features (e.g., from dog
photos). Also, a recent study has demonstrated the cross-modal transferability
phenomenon of this joint space. From these observations, we propose
PromptStyler which simulates various distribution shifts in the joint space by
synthesizing diverse styles via prompts without using any images to deal with
source-free domain generalization. The proposed method learns to generate a
variety of style features (from "a S* style of a") via learnable style word
vectors for pseudo-words S*. To ensure that learned styles do not distort
content information, we force style-content features (from "a S* style of a
[class]") to be located near their corresponding content features (from
"[class]") in the joint vision-language space. After learning style word
vectors, we train a linear classifier using synthesized style-content features.
PromptStyler achieves the state of the art on PACS, VLCS, OfficeHome and
DomainNet, even though it does not require any images for training.
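The two constraints described in the abstract (styles should be diverse, and style-content features should stay near their content features) can be illustrated with a small sketch. The abstract does not state the loss functions, so the forms below are assumptions: cosine similarity as the closeness measure, a mean absolute cosine term for diversity, and a softmax cross-entropy term for content consistency. The function names and the toy vectors are hypothetical, and the plain lists stand in for text features that would come from a vision-language text encoder.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def style_diversity_loss(new_style, previous_styles):
    """Mean |cos| between a new style feature and earlier ones.

    Assumed form: lower values mean the new style feature is closer to
    orthogonal to the styles learned before it.
    """
    if not previous_styles:
        return 0.0
    total = sum(abs(cosine(new_style, s)) for s in previous_styles)
    return total / len(previous_styles)


def content_consistency_loss(style_content_feats, content_feats):
    """Softmax cross-entropy over cosine similarities.

    Assumed form: style_content_feats[k] (from "a S* style of a [class k]")
    should be most similar to content_feats[k] (from "[class k]").
    """
    total = 0.0
    for k, sc in enumerate(style_content_feats):
        sims = [cosine(sc, c) for c in content_feats]
        log_denom = math.log(sum(math.exp(s) for s in sims))
        total += log_denom - sims[k]
    return total / len(style_content_feats)


# Toy features for two classes (hypothetical values, not encoder outputs).
content = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
aligned = [[0.9, 0.1, 0.2], [0.1, 0.9, 0.2]]  # each near its own class
swapped = [aligned[1], aligned[0]]            # each near the wrong class

print(content_consistency_loss(aligned, content)
      < content_consistency_loss(swapped, content))
```

In this toy setup the aligned features give a lower content consistency loss than the swapped ones, which is the behaviour the constraint is meant to encourage. A linear classifier would then be trained on the resulting style-content features, as the abstract describes.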
Related papers
- Visual Captioning at Will: Describing Images and Videos Guided by a Few
Stylized Sentences [49.66987347397398]
Few-Shot Stylized Visual Captioning aims to generate captions in any desired style, using only a few examples as guidance during inference.
We propose a framework called FS-StyleCap for this task, which utilizes a conditional encoder-decoder language model and a visual projection module.
arXiv Detail & Related papers (2023-07-31T04:26:01Z)
- Sem-CS: Semantic CLIPStyler for Text-Based Image Style Transfer [4.588028371034406]
We propose Semantic CLIPStyler (Sem-CS) that performs semantic style transfer.
Sem-CS first segments the content image into salient and non-salient objects and then transfers artistic style based on a given style text description.
Our empirical results, including DISTS, NIMA and user study scores, show that our proposed framework yields superior qualitative and quantitative performance.
arXiv Detail & Related papers (2023-07-12T05:59:42Z)
- Any-to-Any Style Transfer: Making Picasso and Da Vinci Collaborate [58.83278629019384]
Style transfer aims to render the style of a given style-reference image onto a given content-reference image.
Existing approaches either apply the holistic style of the style image in a global manner, or migrate local colors and textures of the style image to the content counterparts in a pre-defined way.
We propose Any-to-Any Style Transfer, which enables users to interactively select styles of regions in the style image and apply them to the prescribed content regions.
arXiv Detail & Related papers (2023-04-19T15:15:36Z)
- StylerDALLE: Language-Guided Style Transfer Using a Vector-Quantized
Tokenizer of a Large-Scale Generative Model [64.26721402514957]
We propose StylerDALLE, a style transfer method that uses natural language to describe abstract art styles.
Specifically, we formulate the language-guided style transfer task as a non-autoregressive token sequence translation.
To incorporate style information, we propose a Reinforcement Learning strategy with CLIP-based language supervision.
arXiv Detail & Related papers (2023-03-16T12:44:44Z)
- SEM-CS: Semantic CLIPStyler for Text-Based Image Style Transfer [4.588028371034406]
We propose Semantic CLIPStyler (Sem-CS) that performs semantic style transfer.
Sem-CS first segments the content image into salient and non-salient objects and then transfers artistic style based on a given style text description.
Our empirical results, including DISTS, NIMA and user study scores, show that our proposed framework yields superior qualitative and quantitative performance.
arXiv Detail & Related papers (2023-03-11T07:33:06Z)
- Few-shot Font Generation by Learning Style Difference and Similarity [84.76381937516356]
We propose a novel font generation approach by learning the Difference between different styles and the Similarity of the same style (DS-Font).
Specifically, we propose a multi-layer style projector for style encoding and realize a distinctive style representation via our proposed Cluster-level Contrastive Style (CCS) loss.
arXiv Detail & Related papers (2023-01-24T13:57:25Z)
- Domain Enhanced Arbitrary Image Style Transfer via Contrastive Learning [84.8813842101747]
Contrastive Arbitrary Style Transfer (CAST) is a new style representation learning and style transfer method via contrastive learning.
Our framework consists of three key components, i.e., a multi-layer style projector for style code encoding, a domain enhancement module for effective learning of style distribution, and a generative network for image style transfer.
arXiv Detail & Related papers (2022-05-19T13:11:24Z)
- STALP: Style Transfer with Auxiliary Limited Pairing [36.23393954839379]
We present an approach to example-based stylization of images that uses a single pair of a source image and its stylized counterpart.
We demonstrate how to train an image translation network that can perform real-time semantically meaningful style transfer to a set of target images.
arXiv Detail & Related papers (2021-10-20T11:38:41Z)
- Language-Driven Image Style Transfer [72.36790598245096]
We introduce a new task -- language-driven image style transfer (LDIST) -- to manipulate the style of a content image, guided by a text.
The discriminator considers the correlation between language and patches of style images or transferred results to jointly embed style instructions.
Experiments show that our CLVA is effective and achieves superb transferred results on LDIST.
arXiv Detail & Related papers (2021-06-01T01:58:50Z)