StyleCLIPDraw: Coupling Content and Style in Text-to-Drawing Translation
- URL: http://arxiv.org/abs/2202.12362v1
- Date: Thu, 24 Feb 2022 21:03:51 GMT
- Title: StyleCLIPDraw: Coupling Content and Style in Text-to-Drawing Translation
- Authors: Peter Schaldenbrand, Zhixuan Liu, Jean Oh
- Abstract summary: We present an approach for generating styled drawings for a given text description where a user can specify a desired drawing style.
Inspired by a theory in art that style and content are generally inseparable during the creative process, we propose a coupled approach, known here as StyleCLIPDraw.
Based on human evaluation, the styles of images generated by StyleCLIPDraw are strongly preferred to those by the sequential approach.
- Score: 10.357474047610172
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generating images that fit a given text description using machine learning
has improved greatly with the release of technologies such as the CLIP
image-text encoder model; however, current methods lack artistic control over the
style of the image to be generated. We present an approach for generating styled
drawings for a given text description where a user can specify a desired
drawing style using a sample image. Inspired by a theory in art that style and
content are generally inseparable during the creative process, we propose a
coupled approach, known here as StyleCLIPDraw, whereby the drawing is generated
by optimizing for style and content simultaneously throughout the process as
opposed to applying style transfer after creating content in a sequence. Based
on human evaluation, the styles of images generated by StyleCLIPDraw are
strongly preferred to those by the sequential approach. Although the quality of
content generation degrades for certain styles, overall, considering both
content and style, StyleCLIPDraw is found to be far more preferred,
indicating the importance of style, look, and feel of machine generated images
to people as well as indicating that style is coupled in the drawing process
itself. Our code (https://github.com/pschaldenbrand/StyleCLIPDraw), a
demonstration (https://replicate.com/pschaldenbrand/style-clip-draw), and style
evaluation data
(https://www.kaggle.com/pittsburghskeet/drawings-with-style-evaluation-styleclipdraw)
are publicly available.
Related papers
- StyleBrush: Style Extraction and Transfer from a Single Image [19.652575295703485]
Stylization for visual content aims to add specific style patterns at the pixel level while preserving the original structural features.
We propose StyleBrush, a method that accurately captures styles from a reference image and "brushes" the extracted style onto other input visual content.
arXiv Detail & Related papers (2024-08-18T14:27:20Z) - StyleShot: A Snapshot on Any Style [20.41380860802149]
We show that, a good style representation is crucial and sufficient for generalized style transfer without test-time tuning.
We achieve this through constructing a style-aware encoder and a well-organized style dataset called StyleGallery.
We highlight that, our approach, named StyleShot, is simple yet effective in mimicking various desired styles, without test-time tuning.
arXiv Detail & Related papers (2024-07-01T16:05:18Z) - Style Aligned Image Generation via Shared Attention [61.121465570763085]
We introduce StyleAligned, a technique designed to establish style alignment among a series of generated images.
By employing minimal "attention sharing" during the diffusion process, our method maintains style consistency across images within T2I models.
Our method's evaluation across diverse styles and text prompts demonstrates high quality and fidelity.
arXiv Detail & Related papers (2023-12-04T18:55:35Z) - StyleCrafter: Enhancing Stylized Text-to-Video Generation with Style Adapter [78.75422651890776]
StyleCrafter is a generic method that enhances pre-trained T2V models with a style control adapter.
To promote content-style disentanglement, we remove style descriptions from the text prompt and extract style information solely from the reference image.
StyleCrafter efficiently generates high-quality stylized videos that align with the content of the texts and resemble the style of the reference images.
arXiv Detail & Related papers (2023-12-01T03:53:21Z) - MOSAIC: Multi-Object Segmented Arbitrary Stylization Using CLIP [0.0]
Style transfer driven by text prompts paved a new path for creatively stylizing the images without collecting an actual style image.
We propose a new method Multi-Object Segmented Arbitrary Stylization Using CLIP (MOSAIC) that can apply styles to different objects in the image based on the context extracted from the input prompt.
Our method extends to arbitrary objects and styles and produces high-quality images compared to current state-of-the-art methods.
arXiv Detail & Related papers (2023-09-24T18:24:55Z) - StyleAdapter: A Unified Stylized Image Generation Model [97.24936247688824]
StyleAdapter is a unified stylized image generation model capable of producing a variety of stylized images.
It can be integrated with existing controllable synthesis methods, such as T2I-adapter and ControlNet.
arXiv Detail & Related papers (2023-09-04T19:16:46Z) - Visual Captioning at Will: Describing Images and Videos Guided by a Few Stylized Sentences [49.66987347397398]
Few-Shot Stylized Visual Captioning aims to generate captions in any desired style, using only a few examples as guidance during inference.
We propose a framework called FS-StyleCap for this task, which utilizes a conditional encoder-decoder language model and a visual projection module.
arXiv Detail & Related papers (2023-07-31T04:26:01Z) - Any-to-Any Style Transfer: Making Picasso and Da Vinci Collaborate [58.83278629019384]
Style transfer aims to render the style of a given image for style reference to another given image for content reference.
Existing approaches either apply the holistic style of the style image in a global manner, or migrate local colors and textures of the style image to the content counterparts in a pre-defined way.
We propose Any-to-Any Style Transfer, which enables users to interactively select styles of regions in the style image and apply them to the prescribed content regions.
arXiv Detail & Related papers (2023-04-19T15:15:36Z) - Domain Enhanced Arbitrary Image Style Transfer via Contrastive Learning [84.8813842101747]
Contrastive Arbitrary Style Transfer (CAST) is a new style representation learning and style transfer method via contrastive learning.
Our framework consists of three key components, i.e., a multi-layer style projector for style code encoding, a domain enhancement module for effective learning of style distribution, and a generative network for image style transfer.
arXiv Detail & Related papers (2022-05-19T13:11:24Z) - StyleCLIPDraw: Coupling Content and Style in Text-to-Drawing Synthesis [9.617654472780874]
StyleCLIPDraw adds a style loss to the CLIPDraw text-to-drawing synthesis model.
Our proposed approach is able to capture a style in both texture and shape.
arXiv Detail & Related papers (2021-11-04T19:57:17Z)