Related papers: Stroke Modeling Enables Vectorized Character Generation with Large Vectorized Glyph Model

URL: http://arxiv.org/abs/2511.11119v1
Date: Fri, 14 Nov 2025 09:48:38 GMT
Title: Stroke Modeling Enables Vectorized Character Generation with Large Vectorized Glyph Model
Authors: Xinyue Zhang, Haolong Li, Jiawei Ma, Chen Ye
Abstract summary: We propose a novel Large Vectorized Glyph Model (LVGM) designed to generate vectorized Chinese glyphs by predicting the next stroke. With limited strokes given, it can generate complete characters, semantically elegant words, and even unseen verses in vectorized form.
Score: 20.240367070645963
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Vectorized glyphs are widely used in poster design, network animation, art display, and various other fields due to their scalability and flexibility. In typography, they are often seen as special sequences composed of ordered strokes. This concept extends to the token sequence prediction abilities of large language models (LLMs), enabling vectorized character generation through stroke modeling. In this paper, we propose a novel Large Vectorized Glyph Model (LVGM) designed to generate vectorized Chinese glyphs by predicting the next stroke. Initially, we encode strokes into discrete latent variables called stroke embeddings. Subsequently, we train our LVGM via fine-tuning DeepSeek LLM by predicting the next stroke embedding. With limited strokes given, it can generate complete characters, semantically elegant words, and even unseen verses in vectorized form. Moreover, we release a new large-scale Chinese SVG dataset containing 907,267 samples based on strokes for dynamically vectorized glyph generation. Experimental results show that our model has scaling behaviors on data scales. Our generated vectorized glyphs have been validated by experts and relevant individuals.
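
The next-stroke prediction idea described in the abstract can be illustrated with a toy sketch. This is not the paper's LVGM: a simple bigram count table stands in for the fine-tuned LLM, and the integer stroke IDs, the `EOS` marker, and the function names `fit_bigram` and `complete_glyph` are all hypothetical, chosen only to show how a glyph might be completed from a few given strokes.

```python
from collections import Counter, defaultdict

# Hypothetical stroke-token IDs; in the paper these would be discrete stroke embeddings.
EOS = -1  # assumed end-of-glyph marker, not taken from the paper


def fit_bigram(sequences):
    """Count which stroke token follows which (toy stand-in for the fine-tuned LLM)."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:] + [EOS]):
            counts[prev][nxt] += 1
    return counts


def complete_glyph(counts, prefix, max_strokes=32):
    """Greedily append the most frequent next stroke until EOS or the length cap."""
    if not prefix:
        raise ValueError("at least one starting stroke is required")
    out = list(prefix)
    while len(out) < max_strokes:
        followers = counts.get(out[-1])
        if not followers:
            break  # unseen stroke: nothing to predict
        nxt = followers.most_common(1)[0][0]
        if nxt == EOS:
            break
        out.append(nxt)
    return out


# Toy corpus: each list is one glyph written as an ordered stroke sequence.
corpus = [[3, 7, 2], [3, 7, 2], [5, 1, 4]]
model = fit_bigram(corpus)
completed = complete_glyph(model, [3])
print(completed)
```

Given only the first stroke `3`, the sketch extends the sequence one stroke at a time, which mirrors the "limited strokes given, complete character generated" behavior at a very small scale. A real system would condition on the full stroke history and decode each predicted token back into vector path data.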

Related papers

VecGlypher: Unified Vector Glyph Generation with Language Models [49.18215716168074]
VecGlypher generates high-fidelity vector glyphs directly from text descriptions or image exemplars. VecGlypher autoregressively emits SVG path tokens, avoiding intermediates and a target character.
arXiv Detail & Related papers (2026-02-25T00:27:23Z)
SwiftSketch: A Diffusion Model for Image-to-Vector Sketch Generation [57.47730473674261]
We introduce SwiftSketch, a model for image-conditioned vector sketch generation that can produce high-quality sketches in less than a second. SwiftSketch operates by progressively denoising stroke control points sampled from a Gaussian distribution. ControlSketch is a method that enhances SDS-based techniques by incorporating precise spatial control through a depth-aware ControlNet.
arXiv Detail & Related papers (2025-02-12T18:57:12Z)
NeuralSVG: An Implicit Representation for Text-to-Vector Generation [54.4153300455889]
We propose NeuralSVG, an implicit neural representation for generating vector graphics from text prompts. To encourage a layered structure in the generated SVG, we introduce a dropout-based regularization technique. We demonstrate that NeuralSVG outperforms existing methods in generating structured and flexible SVG.
arXiv Detail & Related papers (2025-01-07T18:50:06Z)
Vector Grimoire: Codebook-based Shape Generation under Raster Image Supervision [20.325246638505714]
We introduce GRIMOIRE, a text-guided generative model that learns to map images onto a discrete codebook by reconstructing them as vector shapes. Unlike existing models that require direct supervision from data, GRIMOIRE learns using only image supervision which opens up vector generative modeling to significantly more data.
arXiv Detail & Related papers (2024-10-08T12:41:31Z)
StrokeNUWA: Tokenizing Strokes for Vector Graphic Synthesis [112.25071764647683]
StrokeNUWA is a pioneering work exploring a better visual representation, "stroke tokens", on vector graphics. Equipped with stroke tokens, StrokeNUWA can significantly surpass traditional LLM-based and optimization-based methods. StrokeNUWA achieves up to a 94x speedup in inference over the speed of prior methods with an exceptional SVG code compression ratio of 6.9%.
arXiv Detail & Related papers (2024-01-30T15:20:26Z)
VISIT: Visualizing and Interpreting the Semantic Information Flow of Transformers [45.42482446288144]
Recent advances in interpretability suggest we can project weights and hidden states of transformer-based language models to their vocabulary. We investigate LM attention heads and memory values, the vectors the models dynamically create and recall while processing a given input. We create a tool to visualize a forward pass of Generative Pre-trained Transformers (GPTs) as an interactive flow graph.
arXiv Detail & Related papers (2023-05-22T19:04:56Z)
VecFontSDF: Learning to Reconstruct and Synthesize High-quality Vector Fonts via Signed Distance Functions [14.166708010969502]
This paper proposes an end-to-end trainable method, VecFontSDF, to reconstruct and synthesize high-quality vector fonts. Based on the proposed SDF-based implicit shape representation, VecFontSDF learns to model each glyph as shape primitives enclosed by several parabolic curves.
arXiv Detail & Related papers (2023-03-22T16:14:39Z)
DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality Learning [21.123297001902177]
We propose a novel method, DeepVecFont, to generate visually-pleasing vector glyphs. The highlights of this paper are threefold. First, we design a dual-modality learning strategy which utilizes both image-aspect and sequence-aspect features of fonts to synthesize vector glyphs. Second, we provide a new generative paradigm to handle unstructured data (e.g., vector glyphs) by randomly sampling plausible results to get the optimal one which is further refined under the guidance of generated structured data.
arXiv Detail & Related papers (2021-10-13T12:57:19Z)
Scalable Font Reconstruction with Dual Latent Manifolds [55.29525824849242]
We propose a deep generative model that performs typography analysis and font reconstruction. Our approach enables us to massively scale up the number of character types we can effectively model. We evaluate on the task of font reconstruction over various datasets representing character types of many languages.
arXiv Detail & Related papers (2021-09-10T20:37:43Z)
Cloud2Curve: Generation and Vectorization of Parametric Sketches [109.02932608241227]
We present Cloud2Curve, a generative model for scalable high-resolution vector sketches. We evaluate the generation and vectorization capabilities of our model on Quick, Draw! and KMNIST datasets.
arXiv Detail & Related papers (2021-03-29T12:09:42Z)
Word Shape Matters: Robust Machine Translation with Visual Embedding [78.96234298075389]
We introduce a new encoding of the input symbols for character-level NLP models. It encodes the shape of each character through the images depicting the letters when printed. We name this new strategy visual embedding and it is expected to improve the robustness of NLP models.
arXiv Detail & Related papers (2020-10-20T04:08:03Z)

This list is automatically generated from the titles and abstracts of the papers on this site.