StyleCLIPDraw: Coupling Content and Style in Text-to-Drawing Synthesis
- URL: http://arxiv.org/abs/2111.03133v1
- Date: Thu, 4 Nov 2021 19:57:17 GMT
- Title: StyleCLIPDraw: Coupling Content and Style in Text-to-Drawing Synthesis
- Authors: Peter Schaldenbrand, Zhixuan Liu and Jean Oh
- Abstract summary: StyleCLIPDraw adds a style loss to the CLIPDraw text-to-drawing synthesis model.
Our proposed approach is able to capture a style in both texture and shape.
- Score: 9.617654472780874
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generating images that fit a given text description using machine learning
has improved greatly with the release of technologies such as the CLIP
image-text encoder model; however, current methods lack artistic control over the
style of the generated image. We introduce StyleCLIPDraw, which adds a style
loss to the CLIPDraw text-to-drawing synthesis model to allow artistic control
of the synthesized drawings in addition to control of the content via text.
Whereas performing decoupled style transfer on a generated image only affects
the texture, our proposed coupled approach is able to capture a style in both
texture and shape, suggesting that the style of the drawing is coupled with the
drawing process itself. More results and our code are available at
https://github.com/pschaldenbrand/StyleCLIPDraw
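As a rough illustration of the coupled objective described above, the sketch below jointly optimizes vector-stroke parameters against a CLIP text loss (content) and a VGG Gram-matrix style loss (style), so that shape and texture adapt together rather than applying style transfer as a post-process. This is a minimal sketch, not the authors' released code: the renderer handle `render_strokes`, the VGG layer choices, and the loss weights are illustrative assumptions; the actual implementation lives in the linked repository.

```python
# Minimal sketch of a coupled content/style drawing loop (assumptions noted above).
import torch
import torch.nn.functional as F
import clip                      # https://github.com/openai/CLIP
from torchvision import models

device = "cuda" if torch.cuda.is_available() else "cpu"
clip_model, _ = clip.load("ViT-B/32", device=device)
clip_model = clip_model.float()  # keep everything in fp32 for simplicity
vgg = models.vgg16(pretrained=True).features.to(device).eval()

def gram(feat):
    # Gram matrix of a (B, C, H, W) feature map, used for the style loss.
    b, c, h, w = feat.shape
    f = feat.view(b, c, h * w)
    return f @ f.transpose(1, 2) / (c * h * w)

def style_loss(canvas, style_img):
    # Compare Gram matrices at a few early VGG layers (layer indices are an assumption;
    # ImageNet normalization is omitted for brevity).
    loss, x, y = 0.0, canvas, style_img
    for i, layer in enumerate(vgg[:16]):
        x, y = layer(x), layer(y)
        if i in (3, 8, 15):
            loss = loss + F.mse_loss(gram(x), gram(y))
    return loss

def clip_text_loss(canvas, text_tokens):
    # Content loss: cosine distance between CLIP embeddings of the canvas and the prompt.
    img_emb = clip_model.encode_image(F.interpolate(canvas, size=224, mode="bilinear"))
    txt_emb = clip_model.encode_text(text_tokens)
    return 1 - torch.cosine_similarity(img_emb, txt_emb).mean()

def draw(render_strokes, stroke_params, prompt, style_img, steps=250, style_weight=3.0):
    # `render_strokes` stands in for a differentiable vector renderer (e.g. diffvg);
    # `stroke_params` is a list of leaf tensors (control points, widths, colors)
    # created with requires_grad=True.
    tokens = clip.tokenize([prompt]).to(device)
    opt = torch.optim.Adam(stroke_params, lr=0.1)
    for _ in range(steps):
        canvas = render_strokes(stroke_params)   # (1, 3, H, W) image in [0, 1]
        loss = clip_text_loss(canvas, tokens) + style_weight * style_loss(canvas, style_img)
        opt.zero_grad()
        loss.backward()                          # gradients flow into the strokes themselves
        opt.step()
    return render_strokes(stroke_params)
```

Because the style loss back-propagates all the way into the stroke parameters, it can bend and rearrange the strokes rather than only retexturing a finished image, which is the coupling of style and drawing process that the abstract refers to.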
Related papers
- Zero-Painter: Training-Free Layout Control for Text-to-Image Synthesis [63.757624792753205]
We present Zero-Painter, a framework for layout-conditional text-to-image synthesis.
Our method utilizes object masks and individual descriptions, coupled with a global text prompt, to generate images with high fidelity.
arXiv Detail & Related papers (2024-06-06T13:02:00Z)
- StyleCrafter: Enhancing Stylized Text-to-Video Generation with Style Adapter [78.75422651890776]
StyleCrafter is a generic method that enhances pre-trained T2V models with a style control adapter.
To promote content-style disentanglement, we remove style descriptions from the text prompt and extract style information solely from the reference image.
StyleCrafter efficiently generates high-quality stylized videos that align with the content of the texts and resemble the style of the reference images.
arXiv Detail & Related papers (2023-12-01T03:53:21Z)
- MOSAIC: Multi-Object Segmented Arbitrary Stylization Using CLIP [0.0]
Style transfer driven by text prompts has paved a new path for creatively stylizing images without collecting an actual style image.
We propose a new method Multi-Object Segmented Arbitrary Stylization Using CLIP (MOSAIC) that can apply styles to different objects in the image based on the context extracted from the input prompt.
Our method extends to arbitrary objects and styles, and produces higher-quality images than current state-of-the-art methods.
arXiv Detail & Related papers (2023-09-24T18:24:55Z)
- Visual Captioning at Will: Describing Images and Videos Guided by a Few Stylized Sentences [49.66987347397398]
Few-Shot Stylized Visual Captioning aims to generate captions in any desired style, using only a few examples as guidance during inference.
We propose a framework called FS-StyleCap for this task, which utilizes a conditional encoder-decoder language model and a visual projection module.
arXiv Detail & Related papers (2023-07-31T04:26:01Z)
- A Fast Text-Driven Approach for Generating Artistic Content [11.295288894403754]
We propose a complete framework that generates visual art.
We implement an improved version that can generate a wide range of results with varying degrees of detail, style and structure.
To further enhance the results, we insert an artistic super-resolution module in the generative pipeline.
arXiv Detail & Related papers (2022-06-22T14:34:59Z)
- Interactive Style Transfer: All is Your Palette [74.06681967115594]
We propose a drawing-like interactive style transfer (IST) method, by which users can interactively create a harmonious-style image.
Our IST method can serve as a brush, dipping style from anywhere and painting it onto any region of the target content image.
arXiv Detail & Related papers (2022-03-25T06:38:46Z)
- APRNet: Attention-based Pixel-wise Rendering Network for Photo-Realistic Text Image Generation [11.186226578337125]
Style-guided text image generation aims to synthesize a text image by imitating a reference image's appearance.
In this paper, we focus on transferring the style image's background and foreground color patterns to the content image to generate a photo-realistic text image.
arXiv Detail & Related papers (2022-03-15T07:48:34Z)
- StyleCLIPDraw: Coupling Content and Style in Text-to-Drawing Translation [10.357474047610172]
We present an approach for generating styled drawings for a given text description where a user can specify a desired drawing style.
Inspired by a theory in art that style and content are generally inseparable during the creative process, we propose a coupled approach, known here as StyleCLIPDraw.
Based on human evaluation, the styles of images generated by StyleCLIPDraw are strongly preferred to those generated by the sequential approach.
arXiv Detail & Related papers (2022-02-24T21:03:51Z)
- CLIPDraw: Exploring Text-to-Drawing Synthesis through Language-Image Encoders [0.7734726150561088]
CLIPDraw is an algorithm that synthesizes novel drawings based on natural language input.
It operates over vector strokes rather than pixel images, a constraint that biases drawings towards simpler human-recognizable shapes.
Results compare CLIPDraw with other synthesis-through-optimization methods.
arXiv Detail & Related papers (2021-06-28T16:43:26Z)
- Language-Driven Image Style Transfer [72.36790598245096]
We introduce a new task -- language-driven image style transfer (LDIST) -- to manipulate the style of a content image, guided by a text.
The discriminator considers the correlation between language and patches of style images or transferred results to jointly embed style instructions.
Experiments show that our CLVA is effective and achieves superb transferred results on LDIST.
arXiv Detail & Related papers (2021-06-01T01:58:50Z)
- StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery [71.1862388442953]
We develop a text-based interface for StyleGAN image manipulation.
We first introduce an optimization scheme that utilizes a CLIP-based loss to modify an input latent vector in response to a user-provided text prompt.
Next, we describe a latent mapper that infers a text-guided latent manipulation step for a given input image, allowing faster and more stable text-based manipulation.
arXiv Detail & Related papers (2021-03-31T17:51:25Z)
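For the CLIP-guided latent-optimization scheme summarized in the StyleCLIP entry directly above, a minimal sketch of the idea (not the paper's exact code): a pre-trained StyleGAN-style generator `G` is assumed as a black-box handle, and the learning rate and loss weights are illustrative placeholders.

```python
# Illustrative CLIP-guided latent optimization in the spirit of StyleCLIP's first scheme.
# `G` is a placeholder for a pre-trained StyleGAN-like generator: latent w -> image in [-1, 1].
import torch
import torch.nn.functional as F
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"
clip_model, _ = clip.load("ViT-B/32", device=device)
clip_model = clip_model.float()

def edit_latent(G, w_init, prompt, steps=200, lr=0.05, lambda_l2=0.01):
    tokens = clip.tokenize([prompt]).to(device)
    txt_emb = clip_model.encode_text(tokens).detach()
    w = w_init.clone().detach().requires_grad_(True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        img = (G(w) + 1) / 2                                  # map generator output to [0, 1]
        img = F.interpolate(img, size=224, mode="bilinear")   # CLIP input resolution
        img_emb = clip_model.encode_image(img)
        clip_loss = 1 - torch.cosine_similarity(img_emb, txt_emb).mean()
        l2_loss = ((w - w_init) ** 2).mean()                  # stay close to the original latent
        loss = clip_loss + lambda_l2 * l2_loss
        opt.zero_grad()
        loss.backward()
        opt.step()
    return w.detach()
```

The latent mapper mentioned in the same entry would replace this per-image optimization with a trained network that predicts the manipulation step directly; that variant is not sketched here.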
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.