Recognizing Vector Graphics without Rasterization
- URL: http://arxiv.org/abs/2111.03281v1
- Date: Fri, 5 Nov 2021 06:16:17 GMT
- Title: Recognizing Vector Graphics without Rasterization
- Authors: Xinyang Jiang, Lu Liu, Caihua Shan, Yifei Shen, Xuanyi Dong, Dongsheng Li
- Abstract summary: We consider a different data format for images: vector graphics.
In contrast to raster graphics, which are widely used in image recognition, vector graphics can be scaled up or down to any resolution without aliasing or information loss.
YOLaT builds multi-graphs to model the structural and spatial information in vector graphics, and a dual-stream graph neural network is proposed to detect objects from the graphs.
- Score: 36.31813939087549
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we consider a different data format for images: vector
graphics. In contrast to raster graphics, which are widely used in image
recognition, vector graphics can be scaled up or down to any resolution
without aliasing or information loss, due to the analytic representation of the
primitives in the document. Furthermore, vector graphics provide extra
structural information on how low-level elements group together to form
high-level shapes or structures. These merits of vector graphics have not been
fully leveraged in existing methods. To explore this data format, we target the
fundamental recognition tasks of object localization and classification. We
propose an efficient CNN-free pipeline, called YOLaT (You Only Look at Text),
that does not render the graphic into pixels (i.e., rasterization) but instead
takes the textual document of the vector graphics as input. YOLaT builds
multi-graphs to model the structural and spatial information in vector
graphics, and a dual-stream graph neural network is proposed to detect objects
from these graphs. Our experiments show that by operating directly on vector
graphics, YOLaT outperforms raster-graphics-based object detection baselines
in terms of both average precision and efficiency.
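Since the abstract compresses the pipeline into a few sentences, a toy illustration may help. The sketch below is a minimal, hypothetical reconstruction, not the authors' implementation: it assumes SVG-like primitives given as control-point lists and shows one plausible way to build a multi-graph over them, with one edge set following the primitives (structural) and another connecting nearby points (spatial), the kind of input a dual-stream graph neural network could consume.

```python
# Hypothetical sketch: building a multi-graph from vector-graphic primitives.
# This is NOT the YOLaT implementation; the edge rules and names are
# illustrative assumptions.
from itertools import combinations
import math

# Each primitive is a sequence of (x, y) control points, as it would appear in
# the textual document of an SVG path (line segments, cubic Beziers, ...).
primitives = [
    [(0.0, 0.0), (10.0, 0.0)],                               # a line segment
    [(10.0, 0.0), (12.0, 3.0), (12.0, 7.0), (10.0, 10.0)],   # a cubic Bezier
]

nodes = []          # node features: raw (x, y) coordinates
stroke_edges = []   # edges along each primitive (structural stream)
for prim in primitives:
    start = len(nodes)
    nodes.extend(prim)
    # connect consecutive control points of the same primitive
    stroke_edges += [(start + i, start + i + 1) for i in range(len(prim) - 1)]

# spatial edges: connect any two points that lie within a small radius,
# including points belonging to different primitives
RADIUS = 1.0
spatial_edges = [
    (i, j)
    for i, j in combinations(range(len(nodes)), 2)
    if math.dist(nodes[i], nodes[j]) < RADIUS
]

print(f"{len(nodes)} nodes, {len(stroke_edges)} stroke edges, "
      f"{len(spatial_edges)} spatial edges")
# A dual-stream GNN would run message passing over the stroke-edge and
# spatial-edge graphs separately and fuse the two streams for detection.
```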
Related papers
- NeuralSVG: An Implicit Representation for Text-to-Vector Generation [54.4153300455889]
We propose NeuralSVG, an implicit neural representation for generating vector graphics from text prompts.
To encourage a layered structure in the generated SVG, we introduce a dropout-based regularization technique (a toy sketch of this idea appears after this list).
We demonstrate that NeuralSVG outperforms existing methods in generating structured and flexible SVGs.
arXiv Detail & Related papers (2025-01-07T18:50:06Z) - SVGDreamer++: Advancing Editability and Diversity in Text-Guided SVG Generation [31.76771064173087]
We propose a novel text-guided vector graphics synthesis method to address limitations of existing methods.
We introduce a Hierarchical Image VEctorization (HIVE) framework that operates at the semantic object level.
We also present a Vectorized Particle-based Score Distillation (VPSD) approach to improve the diversity of output SVGs.
arXiv Detail & Related papers (2024-11-26T19:13:38Z) - Vector Grimoire: Codebook-based Shape Generation under Raster Image Supervision [20.325246638505714]
We introduce GRIMOIRE, a text-guided generative model that learns to map images onto a discrete codebook by reconstructing them as vector shapes (a minimal codebook-lookup sketch appears after this list).
Unlike existing models that require direct supervision from vector data, GRIMOIRE learns using only image supervision, which opens up vector generative modeling to significantly more data.
arXiv Detail & Related papers (2024-10-08T12:41:31Z) - SuperSVG: Superpixel-based Scalable Vector Graphics Synthesis [66.44553285020066]
SuperSVG is a superpixel-based vectorization model that achieves fast and high-precision image vectorization.
We propose a two-stage self-training framework, where a coarse-stage model is employed to reconstruct the main structure and a refinement-stage model is used for enriching the details.
Experiments demonstrate the superior performance of our method in terms of reconstruction accuracy and inference time compared to state-of-the-art approaches.
arXiv Detail & Related papers (2024-06-14T07:43:23Z) - Text-Guided Vector Graphics Customization [31.41266632288932]
We propose a novel pipeline that generates high-quality customized vector graphics based on textual prompts.
Our method harnesses the capabilities of large pre-trained text-to-image models.
We evaluate our method using multiple metrics from vector-level, image-level and text-level perspectives.
arXiv Detail & Related papers (2023-09-21T17:59:01Z) - VectorFusion: Text-to-SVG by Abstracting Pixel-Based Diffusion Models [82.93345261434943]
We show that a text-conditioned diffusion model trained on pixel representations of images can be used to generate SVG-exportable vector graphics.
Inspired by recent text-to-3D work, we learn an SVG consistent with a caption using Score Distillation Sampling (a toy SDS sketch appears after this list).
Experiments show greater quality than prior work, and demonstrate a range of styles including pixel art and sketches.
arXiv Detail & Related papers (2022-11-21T10:04:27Z) - Learning to Generate Scene Graph from Natural Language Supervision [52.18175340725455]
We propose one of the first methods that learn from image-sentence pairs to extract a graphical representation of localized objects and their relationships within an image, known as a scene graph.
We leverage an off-the-shelf object detector to identify and localize object instances, match labels of detected regions to concepts parsed from captions, and thus create "pseudo" labels for learning scene graphs (a toy version of this matching step appears after this list).
arXiv Detail & Related papers (2021-09-06T03:38:52Z) - Im2Vec: Synthesizing Vector Graphics without Vector Supervision [31.074606918245298]
Vector graphics are widely used to represent fonts, logos, digital artworks, and graphic designs.
One can always rasterize the input graphic and resort to image-based generative approaches.
Current alternatives require explicit supervision on the vector representation at training time, which is difficult to obtain.
We propose a new neural network that can generate complex vector graphics with varying topologies.
arXiv Detail & Related papers (2021-02-04T18:39:45Z) - Promoting Graph Awareness in Linearized Graph-to-Text Generation [72.83863719868364]
We study the ability of linearized models to encode local graph structures.
Our findings motivate denoising scaffolds that enrich the quality of models' implicit graph encodings.
We find that these denoising scaffolds lead to substantial improvements in downstream generation in low-resource settings.
arXiv Detail & Related papers (2020-12-31T18:17:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information listed here and is not responsible for any consequences arising from its use.