VecGlypher: Unified Vector Glyph Generation with Language Models
- URL: http://arxiv.org/abs/2602.21461v1
- Date: Wed, 25 Feb 2026 00:27:23 GMT
- Title: VecGlypher: Unified Vector Glyph Generation with Language Models
- Authors: Xiaoke Huang, Bhavul Gauri, Kam Woh Ng, Tony Ng, Mengmeng Xu, Zhiheng Liu, Weiming Ren, Zhaochong An, Zijian Zhou, Haonan Qiu, Yuyin Zhou, Sen He, Ziheng Wang, Tao Xiang, Xiao Han
- Abstract summary: VecGlypher generates high-fidelity vector glyphs directly from text descriptions or image exemplars. Given a style prompt, optional reference glyph images, and a target character, it autoregressively emits SVG path tokens, avoiding raster intermediates.
- Score: 49.18215716168074
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Vector glyphs are the atomic units of digital typography, yet most learning-based pipelines still depend on carefully curated exemplar sheets and raster-to-vector postprocessing, which limits accessibility and editability. We introduce VecGlypher, a single multimodal language model that generates high-fidelity vector glyphs directly from text descriptions or image exemplars. Given a style prompt, optional reference glyph images, and a target character, VecGlypher autoregressively emits SVG path tokens, avoiding raster intermediates and producing editable, watertight outlines in one pass. A typography-aware data and training recipe makes this possible: (i) a large-scale continuation stage on 39K noisy Envato fonts to master SVG syntax and long-horizon geometry, followed by (ii) post-training on 2.5K expert-annotated Google Fonts with descriptive tags and exemplars to align language and imagery with geometry; preprocessing normalizes coordinate frames, canonicalizes paths, de-duplicates families, and quantizes coordinates for stable long-sequence decoding. On cross-family OOD evaluation, VecGlypher substantially outperforms both general-purpose LLMs and specialized vector-font baselines for text-only generation, while image-referenced generation reaches state-of-the-art performance, with marked gains over DeepVecFont-v2 and DualVector. Ablations show that model scale and the two-stage recipe are critical and that absolute-coordinate serialization yields the best geometry. VecGlypher lowers the barrier to font creation by letting users design with words or exemplars, and provides a scalable foundation for future multimodal design tools.
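The preprocessing the abstract describes (coordinate-frame normalization plus coordinate quantization for absolute-coordinate serialization) can be illustrated with a minimal sketch. The paper does not publish this code; the 1024-bin grid, the bounding-box normalization, and the function name below are assumptions for illustration only.

```python
# Illustrative sketch (not VecGlypher's actual pipeline): normalize a glyph's
# coordinate frame to its bounding box, then quantize absolute (x, y)
# coordinates onto an integer lattice suitable for token-based decoding.
# The grid size of 1024 is an assumed hyperparameter.

def quantize_path(points, grid=1024):
    """Map (x, y) control points into a [0, grid) integer lattice."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    # Normalize the coordinate frame to the glyph's bounding box,
    # preserving aspect ratio via a single uniform scale.
    min_x, min_y = min(xs), min(ys)
    span = max(max(xs) - min_x, max(ys) - min_y) or 1.0
    tokens = []
    for x, y in points:
        qx = min(grid - 1, round((x - min_x) / span * (grid - 1)))
        qy = min(grid - 1, round((y - min_y) / span * (grid - 1)))
        tokens.extend([qx, qy])  # absolute-coordinate serialization
    return tokens

print(quantize_path([(0.0, 0.0), (50.0, 100.0), (100.0, 100.0)]))
```

Quantizing to a fixed integer grid keeps the vocabulary finite and makes long autoregressive sequences easier to decode stably, which is what the abstract's "quantizes coordinates for stable long-sequence decoding" refers to.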
Related papers
- Stroke Modeling Enables Vectorized Character Generation with Large Vectorized Glyph Model [20.240367070645963]
We propose a novel Large Vectorized Glyph Model (LVGM) designed to generate vectorized Chinese glyphs by predicting the next stroke.
Given only a few initial strokes, it can generate complete characters, semantically elegant words, and even unseen verses in vectorized form.
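The next-stroke prediction loop that this summary describes can be sketched generically. The stand-in lookup-table "model", the stroke names, and the end-of-glyph marker below are all hypothetical; LVGM's actual stroke representation is not shown here.

```python
# Hypothetical sketch of next-stroke autoregression: a model repeatedly
# predicts the next stroke given the strokes emitted so far, stopping at an
# end-of-glyph marker. A real model would be a neural network; here a lookup
# table stands in so the loop itself is runnable.

END = "<eog>"

def generate_glyph(prefix, next_stroke):
    """Autoregressively extend a partial glyph stroke-by-stroke."""
    strokes = list(prefix)
    while True:
        s = next_stroke(strokes)
        if s == END:
            return strokes
        strokes.append(s)

# Toy 'model': completes a three-stroke form of a hypothetical character.
table = {
    ("horizontal",): "vertical",
    ("horizontal", "vertical"): "hook",
    ("horizontal", "vertical", "hook"): END,
}
print(generate_glyph(["horizontal"], lambda s: table[tuple(s)]))
```

Seeding the loop with a non-empty prefix corresponds to the "limited strokes given" setting: the model completes a glyph from partial input.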
arXiv Detail & Related papers (2025-11-14T09:48:38Z)
- See it. Say it. Sorted: Agentic System for Compositional Diagram Generation [0.5079602839359522]
We study sketch-to-diagram generation: converting rough hand sketches into precise, compositional diagrams.
We introduce See it. Say it. Sorted., a training-free agentic system that couples a Vision-Language Model (VLM) with Large Language Models (LLMs).
The system runs an iterative loop in which a Critic VLM proposes a small set of qualitative edits; multiple candidate LLMs synthesize updates with diverse strategies.
This design prioritizes qualitative reasoning over brittle numerical estimates, preserves global constraints (e.g., alignment, connectivity), and naturally supports human-in-the-loop
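The critic/propose/select loop in this summary can be sketched abstractly. The interfaces, the toy critic, and the string-valued "diagrams" below are hypothetical stand-ins, not the paper's actual components.

```python
# Minimal sketch of the agentic loop: a critic proposes qualitative edits,
# several candidate generators synthesize updates, and the critic's score
# selects the best candidate for the next round. Toy classes stand in for
# the VLM critic and LLM generators so the loop is runnable.

class ToyCritic:
    """Stand-in critic: satisfied once the diagram mentions 'aligned'."""
    def propose_edits(self, diagram):
        return [] if "aligned" in diagram else ["align boxes"]
    def score(self, diagram):
        return len(diagram)  # arbitrary toy preference

class ToyGenerator:
    """Stand-in generator: appends its own fixed revision."""
    def __init__(self, suffix):
        self.suffix = suffix
    def apply(self, diagram, edits):
        return diagram + " " + self.suffix

def refine(diagram, critic, generators, max_rounds=5):
    for _ in range(max_rounds):
        edits = critic.propose_edits(diagram)
        if not edits:
            break  # critic is satisfied; stop early
        candidates = [g.apply(diagram, edits) for g in generators]
        diagram = max(candidates, key=critic.score)  # keep the best update
    return diagram

print(refine("boxes", ToyCritic(), [ToyGenerator("aligned"), ToyGenerator("x")]))
```

Keeping the critic's feedback qualitative (named edits rather than coordinates) is the design point the summary highlights: the numeric details are left to the generators.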
arXiv Detail & Related papers (2025-08-21T04:20:36Z)
- SVGen: Interpretable Vector Graphics Generation with Large Language Models [61.62816031675714]
We introduce SVG-1M, a large-scale dataset of high-quality SVGs paired with natural language descriptions.
We create well-aligned text-to-SVG training pairs, including a subset with Chain-of-Thought annotations for enhanced semantic guidance.
Based on this dataset, we propose SVGen, an end-to-end model that generates SVG code from natural language inputs.
arXiv Detail & Related papers (2025-08-06T15:00:24Z)
- NeuralSVG: An Implicit Representation for Text-to-Vector Generation [54.4153300455889]
We propose NeuralSVG, an implicit neural representation for generating vector graphics from text prompts.
To encourage a layered structure in the generated SVG, we introduce a dropout-based regularization technique.
We demonstrate that NeuralSVG outperforms existing methods in generating structured and flexible SVGs.
arXiv Detail & Related papers (2025-01-07T18:50:06Z)
- Visually Descriptive Language Model for Vector Graphics Reasoning [76.42082386029206]
We propose the Visually Descriptive Language Model (VDLM) to bridge the gap between low-level visual perception and high-level language reasoning.
We show that VDLM significantly improves state-of-the-art LMMs like GPT-4o on various multimodal perception and reasoning tasks.
arXiv Detail & Related papers (2024-04-09T17:30:18Z)
- DualVector: Unsupervised Vector Font Synthesis with Dual-Part Representation [43.64428946288288]
Current font synthesis methods fail to represent the shape concisely or require vector supervision during training.
We propose a novel dual-part representation for vector glyphs, where each glyph is modeled as a collection of closed "positive" and "negative" path pairs.
Our method, named DualVector, outperforms state-of-the-art methods for practical use.
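The dual-part representation described above can be illustrated with a small sketch: a glyph as closed "positive" paths (ink) minus closed "negative" paths (holes, such as the counter of an 'o'). The point-in-polygon test and the square outlines below are illustrative simplifications, not the paper's actual parameterization.

```python
# Hedged sketch of the dual-part idea: a point is inside the glyph if it lies
# in some positive region and in no negative region. Polygons approximate the
# closed outlines; the real method uses smooth parametric paths.

def point_in_polygon(pt, poly):
    """Standard ray-casting test for a point inside a closed polygon."""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            if x < (x2 - x1) * (y - y1) / (y2 - y1) + x1:
                inside = not inside
    return inside

def in_glyph(pt, positives, negatives):
    return any(point_in_polygon(pt, p) for p in positives) and \
           not any(point_in_polygon(pt, q) for q in negatives)

# An 'O'-like glyph: the outer square is positive, the inner square negative.
outer = [(0, 0), (10, 0), (10, 10), (0, 10)]
inner = [(3, 3), (7, 3), (7, 7), (3, 7)]
print(in_glyph((1, 5), [outer], [inner]))  # on the ring
print(in_glyph((5, 5), [outer], [inner]))  # in the hole
```

Modeling holes explicitly as negative paths keeps each component a simple closed curve, which is easier to represent than one self-intersecting outline.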
arXiv Detail & Related papers (2023-05-17T08:18:06Z)
- DeepVecFont-v2: Exploiting Transformers to Synthesize Vector Fonts with Higher Quality [38.32966391626858]
This paper proposes an enhanced version of DeepVecFont for vector font synthesis.
We adopt Transformers instead of RNNs to process sequential data and design a relaxation representation for vector outlines.
We also propose to sample auxiliary points in addition to control points to precisely align the generated and target Bézier curves or lines.
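Sampling auxiliary points along a curve, as described above, amounts to evaluating the standard cubic Bézier parameterization at several parameter values so that generated and target curves can be compared point-to-point. The sample count and function names below are illustrative choices, not DeepVecFont-v2's exact procedure.

```python
# Sketch of auxiliary-point sampling: evaluate the cubic Bézier
#   B(t) = (1-t)^3 p0 + 3(1-t)^2 t p1 + 3(1-t) t^2 p2 + t^3 p3
# at n evenly spaced parameters, giving extra alignment targets beyond the
# four control points themselves.

def cubic_bezier(p0, p1, p2, p3, t):
    """Evaluate a cubic Bézier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    x = u**3 * p0[0] + 3 * u**2 * t * p1[0] + 3 * u * t**2 * p2[0] + t**3 * p3[0]
    y = u**3 * p0[1] + 3 * u**2 * t * p1[1] + 3 * u * t**2 * p2[1] + t**3 * p3[1]
    return (x, y)

def sample_auxiliary_points(p0, p1, p2, p3, n=8):
    """Sample n points along the segment, including both endpoints."""
    return [cubic_bezier(p0, p1, p2, p3, i / (n - 1)) for i in range(n)]

pts = sample_auxiliary_points((0, 0), (0, 1), (1, 1), (1, 0), n=5)
print(pts[0], pts[-1])  # endpoints coincide with p0 and p3
```

Because the endpoints of the sampled set coincide with p0 and p3, an alignment loss over these points constrains both the curve's interior shape and its junctions with neighboring segments.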
arXiv Detail & Related papers (2023-03-25T23:28:19Z)
- Towards Layer-wise Image Vectorization [57.26058135389497]
We propose Layer-wise Image Vectorization, namely LIVE, to convert images to SVGs while maintaining their image topology.
LIVE generates compact forms with layer-wise structures that are semantically consistent with human perception.
LIVE produces human-editable SVGs for designers and can be used in other applications.
arXiv Detail & Related papers (2022-06-09T17:55:02Z)
- DeepVecFont: Synthesizing High-quality Vector Fonts via Dual-modality Learning [21.123297001902177]
We propose a novel method, DeepVecFont, to generate visually-pleasing vector glyphs.
The highlights of this paper are threefold. First, we design a dual-modality learning strategy which utilizes both image-aspect and sequence-aspect features of fonts to synthesize vector glyphs.
Second, we provide a new generative paradigm to handle unstructured data (e.g., vector glyphs) by randomly sampling plausible results to get the optimal one which is further refined under the guidance of generated structured data.
arXiv Detail & Related papers (2021-10-13T12:57:19Z)
- Font Completion and Manipulation by Cycling Between Multi-Modality Representations [113.26243126754704]
We explore the generation of font glyphs as 2D graphic objects, using a graph as an intermediate representation.
We formulate a cross-modality cycled image-to-image structure with a graph representation between an image encoder and an image decoder.
Our model generates better results than both the image-to-image baseline and previous state-of-the-art methods for glyph completion.
arXiv Detail & Related papers (2021-08-30T02:43:29Z)
- Rethinking Text Line Recognition Models [57.47147190119394]
We consider two decoder families (Connectionist Temporal Classification and Transformer) and three encoder modules (Bidirectional LSTMs, Self-Attention, and GRCLs).
We compare their accuracy and performance on widely used public datasets of scene and handwritten text.
Unlike the more common Transformer-based models, this architecture can handle inputs of arbitrary length.
arXiv Detail & Related papers (2021-04-15T21:43:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.