From Tokens to Numbers: Continuous Number Modeling for SVG Generation
- URL: http://arxiv.org/abs/2602.02820v1
- Date: Mon, 02 Feb 2026 21:20:38 GMT
- Title: From Tokens to Numbers: Continuous Number Modeling for SVG Generation
- Authors: Michael Ogezi, Martin Bell, Freda Shi, Ethan Smith,
- Abstract summary: Continuous Number Modeling (CNM) is an approach that directly models numbers as first-class, continuous values rather than discrete tokens.<n>Our formulation improves training speed by over 30% while maintaining higher fidelity compared to alternative approaches.
- Score: 17.597559308984042
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: For certain image generation tasks, vector graphics such as Scalable Vector Graphics (SVGs) offer clear benefits such as increased flexibility, size efficiency, and editing ease, but remain less explored than raster-based approaches. A core challenge is that the numerical, geometric parameters, which make up a large proportion of SVGs, are inefficiently encoded as long sequences of tokens. This slows training, reduces accuracy, and hurts generalization. To address these problems, we propose Continuous Number Modeling (CNM), an approach that directly models numbers as first-class, continuous values rather than discrete tokens. This formulation restores the mathematical elegance of the representation by aligning the model's inputs with the data's continuous nature, removing discretization artifacts introduced by token-based encoding. We then train a multimodal transformer on 2 million raster-to-SVG samples, followed by fine-tuning via reinforcement learning using perceptual feedback to further improve visual quality. Our approach improves training speed by over 30% while maintaining higher perceptual fidelity compared to alternative approaches. This work establishes CNM as a practical and efficient approach for high-quality vector generation, with potential for broader applications. We make our code available http://github.com/mikeogezi/CNM.
Related papers
- Autoregressive Image Generation with Masked Bit Modeling [34.36577356251466]
Bit AutoRegressive modeling (BAR) is a scalable framework that supports arbitrary codebook sizes.<n>BAR achieves a new state-of-the-art gFID of 0.99 on ImageNet-256, outperforming leading methods across both continuous and discrete paradigms.
arXiv Detail & Related papers (2026-02-09T18:59:58Z) - Continuous Autoregressive Language Models [56.49239051750678]
We introduce Continuous Autoregressive Language Models (CALM)<n>CALM uses a high-fidelity autoencoder to compress a chunk of K tokens into a single continuous vector.<n>We develop a comprehensive likelihood-free framework that enables robust training, evaluation, and controllable sampling.
arXiv Detail & Related papers (2025-10-31T17:58:11Z) - REAR: Rethinking Visual Autoregressive Models via Generator-Tokenizer Consistency Regularization [130.46612643194973]
reAR is a simple training strategy introducing a token-wise regularization objective.<n>On ImageNet, it reduces gFID from 3.02 to 1.86 and improves IS to 316.9 using a standardization-based tokenizer.<n>When applied to advanced tokenizers, it achieves a gFID of 1.42 with only 177M parameters, matching the performance with larger state-of-the-art diffusion models (675M)
arXiv Detail & Related papers (2025-10-06T02:48:13Z) - See it. Say it. Sorted: Agentic System for Compositional Diagram Generation [0.5079602839359522]
We study sketch-to-diagram generation: converting rough hand sketches into precise, compositional diagrams.<n>We introduce See it. Say it. Sorted., a training-free agentic system that couples a Vision-Language Model (VLM) with Large Language Models (LLMs)<n>The system runs an iterative loop in which a Critic VLM proposes a small set of qualitative, edits; multiple candidate LLMs synthesize updates with diverse strategies.<n>This design prioritizes qualitative reasoning over brittle numerical estimates, preserves global constraints (e.g., alignment, connectivity), and naturally supports human-in-the-loop
arXiv Detail & Related papers (2025-08-21T04:20:36Z) - SVGen: Interpretable Vector Graphics Generation with Large Language Models [61.62816031675714]
We introduce SVG-1M, a large-scale dataset of high-quality SVGs paired with natural language descriptions.<n>We create well-aligned Text to SVG training pairs, including a subset with Chain of Thought annotations for enhanced semantic guidance.<n>Based on this dataset, we propose SVGen, an end-to-end model that generates SVG code from natural language inputs.
arXiv Detail & Related papers (2025-08-06T15:00:24Z) - Token-Shuffle: Towards High-Resolution Image Generation with Autoregressive Models [92.18057318458528]
Token-Shuffle is a novel method that reduces the number of image tokens in Transformer.<n>Our strategy requires no additional pretrained text-encoder and enables MLLMs to support extremely high-resolution image synthesis.<n>In GenAI-benchmark, our 2.7B model achieves 0.77 overall score on hard prompts, outperforming AR models LlamaGen by 0.18 and diffusion models LDM by 0.15.
arXiv Detail & Related papers (2025-04-24T17:59:56Z) - SuperSVG: Superpixel-based Scalable Vector Graphics Synthesis [66.44553285020066]
SuperSVG is a superpixel-based vectorization model that achieves fast and high-precision image vectorization.
We propose a two-stage self-training framework, where a coarse-stage model is employed to reconstruct the main structure and a refinement-stage model is used for enriching the details.
Experiments demonstrate the superior performance of our method in terms of reconstruction accuracy and inference time compared to state-of-the-art approaches.
arXiv Detail & Related papers (2024-06-14T07:43:23Z) - NP-DRAW: A Non-Parametric Structured Latent Variable Modelfor Image
Generation [139.8037697822064]
We present a non-parametric structured latent variable model for image generation, called NP-DRAW.
It sequentially draws on a latent canvas in a part-by-part fashion and then decodes the image from the canvas.
arXiv Detail & Related papers (2021-06-25T05:17:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.