DNA: Dual-branch Network with Adaptation for Open-Set Online Handwriting Generation
- URL: http://arxiv.org/abs/2511.22064v1
- Date: Thu, 27 Nov 2025 03:30:22 GMT
- Title: DNA: Dual-branch Network with Adaptation for Open-Set Online Handwriting Generation
- Authors: Tsai-Ling Huang, Nhat-Tuong Do-Tran, Ngoc-Hoang-Lam Le, Hong-Han Shuai, Ching-Chun Huang
- Abstract summary: We introduce our method for online handwriting generation, where the writer's style and the characters generated during testing are unseen during training. We propose a Dual-branch Network with Adaptation (DNA), which comprises an adaptive style branch and an adaptive content branch. Our DNA model is well-suited for the unseen OHG setting, achieving state-of-the-art performance.
- Score: 28.985690380954765
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Online handwriting generation (OHG) enhances handwriting recognition models by synthesizing diverse, human-like samples. However, existing OHG methods struggle to generate unseen characters, particularly in glyph-based languages like Chinese, limiting their real-world applicability. In this paper, we introduce our method for OHG, where the writer's style and the characters generated during testing are unseen during training. To tackle this challenge, we propose a Dual-branch Network with Adaptation (DNA), which comprises an adaptive style branch and an adaptive content branch. The style branch learns stroke attributes such as writing direction, spacing, placement, and flow to generate realistic handwriting. Meanwhile, the content branch is designed to generalize effectively to unseen characters by decomposing character content into structural information and texture details, extracted via local and global encoders, respectively. Extensive experiments demonstrate that our DNA model is well-suited for the unseen OHG setting, achieving state-of-the-art performance.
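The dual-branch design in the abstract can be illustrated with a toy sketch. Everything below is an illustrative assumption, not the authors' implementation: the encoders are simple statistics standing in for learned networks, and the fusion and decoding rules are invented for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

def style_encoder(strokes):
    # Toy style branch: summarize stroke attributes (writing direction,
    # spacing) into a fixed-size style vector. Illustrative only.
    deltas = np.diff(strokes, axis=0)                # per-step pen movement
    direction = deltas.mean(axis=0)                  # average writing direction
    spacing = np.linalg.norm(deltas, axis=1).mean()  # average stroke spacing
    return np.concatenate([direction, [spacing]])

def local_content_encoder(glyph):
    # Toy local encoder: per-patch averages as stand-in "structural" features.
    h, w = glyph.shape
    patches = glyph.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    return patches.ravel()

def global_content_encoder(glyph):
    # Toy global encoder: whole-glyph statistics as stand-in "texture" features.
    return np.array([glyph.mean(), glyph.std()])

def dna_generate(strokes, glyph, steps=5):
    # Fuse the style code with the local + global content codes, then decode
    # a toy trajectory: each step moves along the mean style direction,
    # scaled by a content-dependent factor.
    style = style_encoder(strokes)
    content = np.concatenate([local_content_encoder(glyph),
                              global_content_encoder(glyph)])
    scale = 1.0 + content.mean()
    point = np.zeros(2)
    traj = [point.copy()]
    for _ in range(steps):
        point = point + style[:2] * scale
        traj.append(point.copy())
    return np.stack(traj)

strokes = rng.normal(size=(10, 2))  # reference handwriting trajectory
glyph = rng.random((8, 8))          # rendered character image
traj = dna_generate(strokes, glyph)
print(traj.shape)  # (6, 2)
```

The point of the sketch is the separation of concerns: style is read only from the reference trajectory, content only from the character image, which is what lets the content path generalize to characters unseen in training.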
Related papers
- DiffInk: Glyph- and Style-Aware Latent Diffusion Transformer for Text to Online Handwriting Generation [41.08176249345279]
DiffInk is the first latent diffusion Transformer framework for full-line handwriting generation. We first introduce InkVAE, a novel sequential variational autoencoder enhanced with two complementary latent-space regularization losses. We then introduce InkDiT, a novel latent diffusion Transformer that integrates target text and reference styles to generate coherent pen trajectories.
arXiv Detail & Related papers (2025-09-28T03:58:15Z)
- Dual Orthogonal Guidance for Robust Diffusion-based Handwritten Text Generation [55.35931633405974]
Diffusion-based Handwritten Text Generation (HTG) approaches achieve impressive results on frequent, in-vocabulary words observed at training time and on regular styles. However, they are prone to memorizing training samples and often struggle with style variability and generation clarity. We propose a novel sampling guidance strategy, Dual Orthogonal Guidance (DOG), that leverages a negatively perturbed prompt alongside the original prompt. Experimental results on the state-of-the-art DiffusionPen and One-DM demonstrate that DOG improves both content clarity and variability, even for out-of-vocabulary words and challenging writing styles.
arXiv Detail & Related papers (2025-08-23T13:09:19Z)
- Contrastive Masked Autoencoders for Character-Level Open-Set Writer Identification [25.996617568144675]
This paper introduces Contrastive Masked Auto-Encoders (CMAE) for Character-level Open-Set Writer Identification. We merge Masked Auto-Encoders (MAE) with Contrastive Learning (CL) to capture sequential information and distinguish diverse handwriting styles, respectively. Our model achieves state-of-the-art results on the CASIA online handwriting dataset, reaching an impressive precision rate of 89.7%.
arXiv Detail & Related papers (2025-01-21T05:15:10Z)
- Online Writer Retrieval with Chinese Handwritten Phrases: A Synergistic Temporal-Frequency Representation Learning Approach [53.189911918976655]
We propose DOLPHIN, a novel retrieval model designed to enhance handwriting representations through synergistic temporal-frequency analysis. We introduce OLIWER, a large-scale online writer retrieval dataset encompassing over 670,000 Chinese handwritten phrases from 1,731 individuals. Our findings emphasize the significance of point sampling frequency and pressure features in improving handwriting representation quality.
arXiv Detail & Related papers (2024-12-16T11:19:22Z)
- Decoupling Layout from Glyph in Online Chinese Handwriting Generation [6.566541829858544]
We develop a text line layout generator and a stylized font synthesizer. The layout generator performs in-context-like learning based on the text content and the provided style references to generate positions for each glyph autoregressively. The font synthesizer, which consists of a character embedding dictionary, a multi-scale calligraphy style encoder, and a 1D U-Net based diffusion denoiser, generates each font at its position while imitating the calligraphy style extracted from the given style references.
arXiv Detail & Related papers (2024-10-03T08:46:17Z)
- DiffusionPen: Towards Controlling the Style of Handwritten Text Generation [7.398476020996681]
DiffusionPen (DiffPen) is a 5-shot style handwritten text generation approach based on Latent Diffusion Models.
Our approach captures both textual and stylistic characteristics of seen and unseen words and styles, generating realistic handwritten samples.
Our method outperforms existing methods qualitatively and quantitatively, and its additional generated data can improve the performance of Handwriting Text Recognition (HTR) systems.
arXiv Detail & Related papers (2024-09-09T20:58:25Z)
- Disentangling Writer and Character Styles for Handwriting Generation [8.33116145030684]
We present the style-disentangled Transformer (SDT), which employs two complementary contrastive objectives to extract the style commonalities of reference samples.
Our empirical findings reveal that the two learned style representations provide information at different frequency magnitudes.
arXiv Detail & Related papers (2023-03-26T14:32:02Z)
- Boosting Modern and Historical Handwritten Text Recognition with Deformable Convolutions [52.250269529057014]
Handwritten Text Recognition (HTR) in free-layout pages is a challenging image understanding task.
We propose to adopt deformable convolutions, which can deform depending on the input at hand and better adapt to the geometric variations of the text.
arXiv Detail & Related papers (2022-08-17T06:55:54Z)
- Letter-level Online Writer Identification [86.13203975836556]
We focus on a novel problem, letter-level online writer-id, which requires only a few trajectories of written letters as identification cues.
A main challenge is that a person often writes a letter in different styles from time to time.
We refer to this problem as the variance of online writing styles (Var-O-Styles).
arXiv Detail & Related papers (2021-12-06T07:21:53Z)
- Handwriting Transformers [98.3964093654716]
We propose a transformer-based styled handwritten text image generation approach, HWT, that strives to learn both style-content entanglement and global and local writing style patterns.
The proposed HWT captures the long and short range relationships within the style examples through a self-attention mechanism.
Our proposed HWT generates realistic styled handwritten text images and significantly outperforms the state-of-the-art.
arXiv Detail & Related papers (2021-04-08T17:59:43Z)
- POINTER: Constrained Progressive Text Generation via Insertion-based Generative Pre-training [93.79766670391618]
We present POINTER, a novel insertion-based approach for hard-constrained text generation.
The proposed method operates by progressively inserting new tokens between existing tokens in a parallel manner.
The resulting coarse-to-fine hierarchy makes the generation process intuitive and interpretable.
arXiv Detail & Related papers (2020-05-01T18:11:54Z)
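POINTER's progressive, parallel insertion (last entry above) can be illustrated with a toy version. The vocabulary and the `toy_propose` lookup table are invented stand-ins for the pretrained insertion model; only the control flow, inserting tokens into every gap in parallel until no gap accepts an insertion, reflects the described approach.

```python
def toy_insertion_step(tokens, propose):
    # One parallel round: for each gap between adjacent tokens, ask the
    # model (here a toy `propose` function) for a token, or None for no insert.
    out = []
    for i, tok in enumerate(tokens):
        out.append(tok)
        if i + 1 < len(tokens):
            new = propose(tok, tokens[i + 1])
            if new is not None:
                out.append(new)
    return out

def toy_propose(left, right):
    # Invented stand-in for the insertion model: fill a few known gaps,
    # otherwise insert nothing (the no-insertion decision in POINTER).
    table = {("the", "sat"): "cat", ("sat", "mat"): "on", ("on", "mat"): "the"}
    return table.get((left, right))

# Start from hard lexical constraints and refine coarse-to-fine.
tokens = ["the", "sat", "mat"]
while True:
    new_tokens = toy_insertion_step(tokens, toy_propose)
    if new_tokens == tokens:  # no gap accepted an insertion: done
        break
    tokens = new_tokens
print(" ".join(tokens))  # the cat sat on the mat
```

Each round roughly doubles the resolution of the sequence, which is what makes the generation process coarse-to-fine and, as the summary notes, interpretable.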
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.