GenText: Unsupervised Artistic Text Generation via Decoupled Font and
Texture Manipulation
- URL: http://arxiv.org/abs/2207.09649v1
- Date: Wed, 20 Jul 2022 04:42:47 GMT
- Title: GenText: Unsupervised Artistic Text Generation via Decoupled Font and
Texture Manipulation
- Authors: Qirui Huang, Bin Fu, Aozhong Zhang, Yu Qiao
- Abstract summary: We propose a novel approach, namely GenText, to achieve general artistic text style transfer.
Specifically, our work incorporates three stages: stylization, destylization, and font transfer.
Considering the difficulty of acquiring paired artistic text images, our model is designed for the unsupervised setting.
- Score: 30.654807125764965
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automatic artistic text generation is an emerging topic that receives
increasing attention due to its wide applications. Artistic text can be divided
into three components: content, font, and texture. Existing artistic text
generation models usually focus on manipulating only one of these components,
which is a sub-optimal solution for controllable, general artistic text
generation. To remedy this issue, we propose a novel approach, namely GenText,
to achieve general artistic text style transfer by separately migrating the
font and texture styles from different source images to the target image in an
unsupervised manner. Specifically, our work incorporates three stages,
stylization, destylization, and font transfer, into a unified framework with a
single powerful encoder network and two separate style generator networks: one
for font transfer, the other for stylization and destylization. The
destylization stage first extracts the font style from the font reference
image; the font transfer stage then generates the target content in the desired
font style. Finally, the stylization stage renders the resulting font image
with the texture style of the texture reference image. Moreover, considering
the difficulty of acquiring paired artistic text images, our model is designed
for the unsupervised setting, in which all stages can be effectively optimized
from unpaired data. Qualitative and quantitative evaluations on artistic text
benchmarks demonstrate the superior performance of the proposed model. The code
and models will be made publicly available in the future.
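The abstract describes a concrete data flow: one shared encoder feeds two style generators, and destylization, font transfer, and stylization run in sequence. Below is a minimal sketch of that flow under placeholder architectures; the module designs, how destylization is conditioned, and all tensor shapes are illustrative assumptions, since the official code has not been released.

```python
# Illustrative sketch of the GenText three-stage pipeline (not the authors' code).
import torch
import torch.nn as nn


class Encoder(nn.Module):
    """Single shared encoder that extracts features from any input image."""
    def __init__(self, channels=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, channels, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, stride=2, padding=1), nn.ReLU(),
        )

    def forward(self, x):
        return self.net(x)


class StyleGenerator(nn.Module):
    """Style generator that renders content features under a reference style.

    One instance handles font transfer, the other handles stylization and
    destylization, mirroring the two-generator design in the abstract.
    """
    def __init__(self, channels=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels * 2, channels, 3, padding=1), nn.ReLU(),
            nn.Upsample(scale_factor=4, mode="bilinear", align_corners=False),
            nn.Conv2d(channels, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, content_feat, style_feat):
        return self.net(torch.cat([content_feat, style_feat], dim=1))


encoder = Encoder()
font_generator = StyleGenerator()      # font transfer
texture_generator = StyleGenerator()   # stylization / destylization

content_img = torch.randn(1, 3, 64, 64)   # plain rendering of the target content
font_ref = torch.randn(1, 3, 64, 64)      # artistic image carrying the desired font
texture_ref = torch.randn(1, 3, 64, 64)   # artistic image carrying the desired texture

# 1) Destylization: strip the texture from the font reference to expose its font.
#    Here the plain content rendering stands in as a "texture-free" style cue
#    (an assumption made only so the sketch runs end to end).
destylized_font = texture_generator(encoder(font_ref), encoder(content_img))
font_style = encoder(destylized_font)

# 2) Font transfer: generate the target content in the extracted font style.
font_image = font_generator(encoder(content_img), font_style)

# 3) Stylization: render the font image with the texture of the texture reference.
artistic_text = texture_generator(encoder(font_image), encoder(texture_ref))
print(artistic_text.shape)  # torch.Size([1, 3, 64, 64])
```

In an actual unsupervised training setup, the three stages would be optimized with reconstruction and adversarial losses on unpaired data; the forward pass above only illustrates how the stages chain together.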
Related papers
- VitaGlyph: Vitalizing Artistic Typography with Flexible Dual-branch Diffusion Models [53.59400446543756]
We introduce a dual-branch and training-free method, namely VitaGlyph, to enable flexible artistic typography.
VitaGlyph treats the input character as a scene composed of a Subject and its Surrounding, then renders them under varying degrees of geometric transformation.
Experimental results demonstrate that VitaGlyph not only achieves better artistry and readability, but also manages to depict multiple customized concepts.
arXiv Detail & Related papers (2024-10-02T16:48:47Z)
- FontStudio: Shape-Adaptive Diffusion Model for Coherent and Consistent Font Effect Generation [38.730628018627975]
This research aims to tackle the generation of text effects for multilingual fonts.
We introduce a novel shape-adaptive diffusion model capable of interpreting the given shape.
We also present a training-free, shape-adaptive effect transfer method for transferring textures from a generated reference letter to others.
arXiv Detail & Related papers (2024-06-12T16:43:47Z)
- ControlStyle: Text-Driven Stylized Image Generation Using Diffusion Priors [105.37795139586075]
We propose a new task for "stylizing" text-to-image models, namely text-driven stylized image generation.
We present a new diffusion model (ControlStyle) via upgrading a pre-trained text-to-image model with a trainable modulation network.
Experiments demonstrate the effectiveness of our ControlStyle in producing more visually pleasing and artistic results.
arXiv Detail & Related papers (2023-11-09T15:50:52Z)
- TextPainter: Multimodal Text Image Generation with Visual-harmony and Text-comprehension for Poster Design [50.8682912032406]
This study introduces TextPainter, a novel multimodal approach to generate text images.
TextPainter takes the global-local background image as a hint of style and guides the text image generation with visual harmony.
We construct the PosterT80K dataset, consisting of about 80K posters annotated with sentence-level bounding boxes and text contents.
arXiv Detail & Related papers (2023-08-09T06:59:29Z)
- FASTER: A Font-Agnostic Scene Text Editing and Rendering Framework [19.564048493848272]
Scene Text Editing (STE) is a challenging research problem that primarily aims at modifying existing text in an image.
Existing style-transfer-based approaches have shown sub-par editing performance due to complex image backgrounds, diverse font attributes, and varying word lengths within the text.
We propose a novel font-agnostic scene text editing and rendering framework, named FASTER, for simultaneously generating text in arbitrary styles and locations.
arXiv Detail & Related papers (2023-08-05T15:54:06Z)
- GlyphDiffusion: Text Generation as Image Generation [100.98428068214736]
We propose GlyphDiffusion, a novel diffusion approach for text generation via text-guided image generation.
Our key idea is to render the target text as a glyph image containing visual language content.
Our model also makes significant improvements compared to the recent diffusion model.
arXiv Detail & Related papers (2023-04-25T02:14:44Z)
- Improving Diffusion Models for Scene Text Editing with Dual Encoders [44.12999932588205]
Scene text editing is a challenging task that involves modifying or inserting specified texts in an image.
Recent advances in diffusion models have shown promise in overcoming these limitations with text-conditional image editing.
We propose DIFFSTE to improve pre-trained diffusion models with a dual encoder design.
arXiv Detail & Related papers (2023-04-12T02:08:34Z)
- Make-A-Scene: Scene-Based Text-to-Image Generation with Human Priors [58.71128866226768]
Recent text-to-image generation methods have incrementally improved the generated image fidelity and text relevancy.
We propose a novel text-to-image method that addresses these gaps by enabling a simple control mechanism, complementary to text, in the form of a scene.
Our model achieves state-of-the-art FID and human evaluation results, unlocking the ability to generate high fidelity images in a resolution of 512x512 pixels.
arXiv Detail & Related papers (2022-03-24T15:44:50Z)
- TextStyleBrush: Transfer of Text Aesthetics from a Single Example [16.29689649632619]
We present a novel approach for disentangling the content of a text image from all aspects of its appearance.
We learn this disentanglement in a self-supervised manner.
We show results in different text domains which were previously handled by specialized methods.
arXiv Detail & Related papers (2021-06-15T19:28:49Z)
- TediGAN: Text-Guided Diverse Face Image Generation and Manipulation [52.83401421019309]
TediGAN is a framework for multi-modal image generation and manipulation with textual descriptions.
A StyleGAN inversion module maps real images to the latent space of a well-trained StyleGAN.
A visual-linguistic similarity module learns text-image matching by mapping images and text into a common embedding space.
Instance-level optimization preserves identity during manipulation.
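The common-embedding-space matching mentioned for TediGAN can be illustrated generically: embed images and text into the same space and score pairs by cosine similarity. The toy encoders, dimensions, and contrastive-style loss below are assumptions for illustration only, not TediGAN's actual modules.

```python
# Generic sketch of text-image matching in a shared embedding space (illustrative).
import torch
import torch.nn as nn
import torch.nn.functional as F


class ImageEncoder(nn.Module):
    def __init__(self, embed_dim=128):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, embed_dim))

    def forward(self, img):
        # L2-normalize so dot products become cosine similarities.
        return F.normalize(self.net(img), dim=-1)


class TextEncoder(nn.Module):
    def __init__(self, vocab_size=1000, embed_dim=128):
        super().__init__()
        self.embed = nn.EmbeddingBag(vocab_size, embed_dim)  # mean-pooled token embeddings

    def forward(self, token_ids):
        return F.normalize(self.embed(token_ids), dim=-1)


image_enc, text_enc = ImageEncoder(), TextEncoder()
images = torch.randn(4, 3, 32, 32)          # batch of images
captions = torch.randint(0, 1000, (4, 8))   # batch of tokenized descriptions

# Cosine similarity in the shared space; matching pairs should score highest
# on the diagonal, which a visual-linguistic similarity loss encourages.
similarity = image_enc(images) @ text_enc(captions).T
loss = F.cross_entropy(similarity / 0.07, torch.arange(4))  # contrastive-style objective
print(similarity.shape, loss.item())
```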
arXiv Detail & Related papers (2020-12-06T16:20:19Z)
- Exploring Font-independent Features for Scene Text Recognition [22.34023249700896]
Scene text recognition (STR) has been extensively studied in the last few years.
Many recently-proposed methods are specially designed to accommodate the arbitrary shape, layout and orientation of scene texts.
These methods, in which the font features and content features of characters are entangled, perform poorly in text recognition on scene images containing text in novel font styles.
arXiv Detail & Related papers (2020-09-16T03:36:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.