Related papers: Handwriting Transformers

Handwriting Transformers

URL: http://arxiv.org/abs/2104.03964v1
Date: Thu, 8 Apr 2021 17:59:43 GMT
Title: Handwriting Transformers
Authors: Ankan Kumar Bhunia, Salman Khan, Hisham Cholakkal, Rao Muhammad Anwer, Fahad Shahbaz Khan, Mubarak Shah
Abstract summary: We propose a transformer-based styled handwritten text image generation approach, HWT, that strives to learn both style-content entanglement and global and local writing style patterns. The proposed HWT captures the long and short range relationships within the style examples through a self-attention mechanism. Our proposed HWT generates realistic styled handwritten text images and significantly outperforms the state-of-the-art demonstrated.
Score: 98.3964093654716
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We propose a novel transformer-based styled handwritten text image generation approach, HWT, that strives to learn both style-content entanglement as well as global and local writing style patterns. The proposed HWT captures the long and short range relationships within the style examples through a self-attention mechanism, thereby encoding both global and local style patterns. Further, the proposed transformer-based HWT comprises an encoder-decoder attention that enables style-content entanglement by gathering the style representation of each query character. To the best of our knowledge, we are the first to introduce a transformer-based generative network for styled handwritten text generation. Our proposed HWT generates realistic styled handwritten text images and significantly outperforms the state-of-the-art demonstrated through extensive qualitative, quantitative and human-based evaluations. The proposed HWT can handle arbitrary length of text and any desired writing style in a few-shot setting. Further, our HWT generalizes well to the challenging scenario where both words and writing style are unseen during training, generating realistic styled handwritten text images.

Related papers

Zero-Shot Styled Text Image Generation, but Make It Autoregressive [34.09957000751439]
Styled Handwritten Text Generation (HTG) has recently received attention from the computer vision and document analysis communities. We propose a novel framework for text image generation, dubbed Emuru. Our approach leverages a powerful text image representation model (a variational autoencoder) combined with an autoregressive Transformer.
arXiv Detail & Related papers (2025-03-21T11:56:20Z)
Challenging Assumptions in Learning Generic Text Style Embeddings [24.64611983641699]
This study addresses the gap by creating generic, sentence-level style embeddings crucial for style-centric tasks. Our approach is grounded on the premise that low-level text style changes can compose any high-level style.
arXiv Detail & Related papers (2025-01-27T14:21:34Z)
Bringing Characters to New Stories: Training-Free Theme-Specific Image Generation via Dynamic Visual Prompting [71.29100512700064]
We present T-Prompter, a training-free method for theme-specific image generation. T-Prompter integrates reference images into generative models, allowing users to seamlessly specify the target theme. Our approach enables consistent story generation, character design, realistic character generation, and style-guided image generation.
arXiv Detail & Related papers (2025-01-26T19:01:19Z)
Towards Visual Text Design Transfer Across Languages [49.78504488452978]
We introduce a novel task of Multimodal Style Translation (MuST-Bench) MuST-Bench is a benchmark designed to evaluate the ability of visual text generation models to perform translation across different writing systems. In response, we introduce SIGIL, a framework for multimodal style translation that eliminates the need for style descriptions.
arXiv Detail & Related papers (2024-10-24T15:15:01Z)
Beyond Color and Lines: Zero-Shot Style-Specific Image Variations with Coordinated Semantics [3.9717825324709413]
Style has been primarily considered in terms of artistic elements such as colors, brushstrokes, and lighting. In this study, we propose a zero-shot scheme for image variation with coordinated semantics.
arXiv Detail & Related papers (2024-10-24T08:34:57Z)
Style Aligned Image Generation via Shared Attention [61.121465570763085]
We introduce StyleAligned, a technique designed to establish style alignment among a series of generated images. By employing minimal attention sharing' during the diffusion process, our method maintains style consistency across images within T2I models. Our method's evaluation across diverse styles and text prompts demonstrates high-quality and fidelity.
arXiv Detail & Related papers (2023-12-04T18:55:35Z)
StyleCrafter: Enhancing Stylized Text-to-Video Generation with Style Adapter [78.75422651890776]
StyleCrafter is a generic method that enhances pre-trained T2V models with a style control adapter. To promote content-style disentanglement, we remove style descriptions from the text prompt and extract style information solely from the reference image. StyleCrafter efficiently generates high-quality stylized videos that align with the content of the texts and resemble the style of the reference images.
arXiv Detail & Related papers (2023-12-01T03:53:21Z)
StyleAdapter: A Unified Stylized Image Generation Model [97.24936247688824]
StyleAdapter is a unified stylized image generation model capable of producing a variety of stylized images. It can be integrated with existing controllable synthesis methods, such as T2I-adapter and ControlNet.
arXiv Detail & Related papers (2023-09-04T19:16:46Z)
ParaGuide: Guided Diffusion Paraphrasers for Plug-and-Play Textual Style Transfer [57.6482608202409]
Textual style transfer is the task of transforming stylistic properties of text while preserving meaning. We introduce a novel diffusion-based framework for general-purpose style transfer that can be flexibly adapted to arbitrary target styles. We validate the method on the Enron Email Corpus, with both human and automatic evaluations, and find that it outperforms strong baselines on formality, sentiment, and even authorship style transfer.
arXiv Detail & Related papers (2023-08-29T17:36:02Z)
Handwritten Text Generation from Visual Archetypes [25.951540903019467]
We devise a Transformer-based model for Few-Shot styled handwritten text generation. We obtain a robust representation of unseen writers' calligraphy by exploiting specific pre-training on a large synthetic dataset.
arXiv Detail & Related papers (2023-03-27T14:58:20Z)
SLOGAN: Handwriting Style Synthesis for Arbitrary-Length and Out-of-Vocabulary Text [35.83345711291558]
We propose a novel method that can synthesize parameterized and controllable handwriting Styles for arbitrary-Length and Out-of-vocabulary text. We embed the text content by providing an easily obtainable printed style image, so that the diversity of the content can be flexibly achieved. Our method can synthesize words that are not included in the training vocabulary and with various new styles.
arXiv Detail & Related papers (2022-02-23T12:13:27Z)

This list is automatically generated from the titles and abstracts of the papers in this site.