Word-As-Image for Semantic Typography
- URL: http://arxiv.org/abs/2303.01818v2
- Date: Mon, 6 Mar 2023 16:34:15 GMT
- Title: Word-As-Image for Semantic Typography
- Authors: Shir Iluz, Yael Vinker, Amir Hertz, Daniel Berio, Daniel Cohen-Or,
Ariel Shamir
- Abstract summary: A word-as-image is a semantic typography technique where a word illustration presents a visualization of the meaning of the word.
We present a method to create word-as-image illustrations automatically.
- Score: 41.380457098839926
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: A word-as-image is a semantic typography technique where a word illustration
presents a visualization of the meaning of the word, while also preserving its
readability. We present a method to create word-as-image illustrations
automatically. This task is highly challenging as it requires semantic
understanding of the word and a creative idea of where and how to depict these
semantics in a visually pleasing and legible manner. We rely on the remarkable
ability of recent large pretrained language-vision models to distill textual
concepts visually. We target simple, concise, black-and-white designs that
convey the semantics clearly. We deliberately do not change the color or
texture of the letters and do not use embellishments. Our method optimizes the
outline of each letter to convey the desired concept, guided by a pretrained
Stable Diffusion model. We incorporate additional loss terms to ensure the
legibility of the text and the preservation of the style of the font. We show
high quality and engaging results on numerous examples and compare to
alternative techniques.
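To make the described optimization concrete, below is a minimal, self-contained sketch of the kind of loop the abstract outlines: letter control points are optimized under diffusion-based concept guidance plus a style-preservation term. This is an illustration, not the authors' implementation; `rasterize` and `predict_noise` are hypothetical stubs standing in for a differentiable vector-graphics rasterizer and a frozen, text-conditioned Stable Diffusion noise predictor, and the legibility/font-style losses are reduced to a simple control-point regularizer.

```python
# Minimal sketch (NOT the paper's code): optimize letter outline control points
# under diffusion-style concept guidance plus a style-preservation term.
import torch
import torch.nn.functional as F

def rasterize(points: torch.Tensor) -> torch.Tensor:
    """Stub differentiable rasterizer: control points -> 1x3x64x64 image."""
    return torch.sigmoid(points).mean() * torch.ones(1, 3, 64, 64)

def predict_noise(noisy_img: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
    """Stub for a frozen, text-conditioned diffusion noise predictor."""
    return torch.randn_like(noisy_img)

# The letter outline is a set of 2D control points, initialized from the font glyph.
points = torch.randn(32, 2, requires_grad=True)
points_init = points.detach().clone()          # original glyph, used to preserve style
optimizer = torch.optim.Adam([points], lr=1e-2)

for step in range(200):
    img = rasterize(points)

    # Score-distillation-style guidance: noise the rendering at a random timestep,
    # let the frozen diffusion model predict the noise given the concept prompt,
    # and push the image to reduce the disagreement.
    t = torch.randint(1, 1000, (1,))
    eps = torch.randn_like(img)
    noisy = img + 0.1 * eps                        # heavily simplified noising step
    with torch.no_grad():
        eps_hat = predict_noise(noisy, t)
    concept_loss = ((eps_hat - eps) * img).sum()   # gradient w.r.t. img is (eps_hat - eps)

    # Legibility / font-style preservation, reduced here to keeping the control
    # points close to the original glyph (the paper uses dedicated loss terms).
    style_loss = F.mse_loss(points, points_init)

    loss = concept_loss + 0.5 * style_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

In the paper the rendering is differentiable vector-graphics rasterization of the letter's outline and the guidance comes from a pretrained Stable Diffusion model; the stubs above only preserve the shape of the computation.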
Related papers
- Beyond Color and Lines: Zero-Shot Style-Specific Image Variations with Coordinated Semantics [3.9717825324709413]
Style has been primarily considered in terms of artistic elements such as colors, brushstrokes, and lighting.
In this study, we propose a zero-shot scheme for image variation with coordinated semantics.
arXiv Detail & Related papers (2024-10-24T08:34:57Z)
- Compositional Entailment Learning for Hyperbolic Vision-Language Models [54.41927525264365]
We show how to fully leverage the innate hierarchical nature of hyperbolic embeddings by looking beyond individual image-text pairs.
We propose Compositional Entailment Learning for hyperbolic vision-language models.
Empirical evaluation on a hyperbolic vision-language model trained with millions of image-text pairs shows that the proposed compositional learning approach outperforms conventional Euclidean CLIP learning.
arXiv Detail & Related papers (2024-10-09T14:12:50Z)
- Text Guided Image Editing with Automatic Concept Locating and Forgetting [27.70615803908037]
We propose a novel method called Locate and Forget (LaF) to locate potential target concepts in the image for modification.
Compared to the baselines, our method demonstrates its superiority in text-guided image editing tasks both qualitatively and quantitatively.
arXiv Detail & Related papers (2024-05-30T05:36:32Z)
- Hypernymy Understanding Evaluation of Text-to-Image Models via WordNet Hierarchy [12.82992353036576]
We measure the capability of popular text-to-image models to understand hypernymy, the "is-a" relation between words (a minimal WordNet lookup illustrating this relation is sketched after this list).
We show how our metrics can provide a better understanding of the individual strengths and weaknesses of popular text-to-image models.
arXiv Detail & Related papers (2023-10-13T16:53:25Z)
- Inversion-Based Style Transfer with Diffusion Models [78.93863016223858]
Previous arbitrary example-guided artistic image generation methods often fail to control shape changes or convey elements.
We propose an inversion-based style transfer method (InST), which can efficiently and accurately learn the key information of an image.
arXiv Detail & Related papers (2022-11-23T18:44:25Z)
- Comprehending and Ordering Semantics for Image Captioning [124.48670699658649]
We propose a new Transformer-style architecture, namely Comprehending and Ordering Semantics Networks (COS-Net).
COS-Net unifies an enriched semantic comprehending and a learnable semantic ordering processes into a single architecture.
arXiv Detail & Related papers (2022-06-14T15:51:14Z)
- Toward a Visual Concept Vocabulary for GAN Latent Space [74.12447538049537]
This paper introduces a new method for building open-ended vocabularies of primitive visual concepts represented in a GAN's latent space.
Our approach is built from three components, including automatic identification of perceptually salient directions based on their layer selectivity and human annotation of these directions with free-form, compositional natural language descriptions.
Experiments show that concepts learned with our approach are reliable and composable -- generalizing across classes, contexts, and observers.
arXiv Detail & Related papers (2021-10-08T17:58:19Z)
- Paint by Word [32.05329583044764]
We investigate the problem of zero-shot semantic image painting.
Instead of painting modifications into an image using only concrete colors or a finite set of semantic concepts, we ask how to create semantic paint based on open full-text descriptions.
Our method combines a state-of-the-art generative model of realistic images with a state-of-the-art text-image semantic similarity network.
arXiv Detail & Related papers (2021-03-19T17:59:08Z)
- Accurate Word Representations with Universal Visual Guidance [55.71425503859685]
This paper proposes a visual representation method to explicitly enhance conventional word embedding with multiple-aspect senses from visual guidance.
We build a small-scale word-image dictionary from a multimodal seed dataset where each word corresponds to diverse related images.
Experiments on 12 natural language understanding and machine translation tasks further verify the effectiveness and the generalization capability of the proposed approach.
arXiv Detail & Related papers (2020-12-30T09:11:50Z)
- GANwriting: Content-Conditioned Generation of Styled Handwritten Word Images [10.183347908690504]
We take a step closer to producing realistic and varied artificially rendered handwritten words.
We propose a novel method that is able to produce credible handwritten word images by conditioning the generative process with both calligraphic style features and textual content.
arXiv Detail & Related papers (2020-03-05T12:37:29Z)
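As referenced in the hypernymy entry above, the "is-a" relation that paper evaluates can be illustrated with a minimal WordNet lookup via NLTK. This is only a sketch of the relation itself, not that paper's benchmark code.

```python
# Minimal illustration of hypernymy (the "is-a" relation) using NLTK's WordNet
# interface; requires `pip install nltk` and a one-time corpus download.
import nltk
nltk.download("wordnet", quiet=True)
from nltk.corpus import wordnet as wn

dog = wn.synset("dog.n.01")
print(dog.hypernyms())          # direct "is-a" parents, e.g. canine.n.02
print(dog.hypernym_paths()[0])  # one full chain up to the root synset entity.n.01
```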