CoSE: Compositional Stroke Embeddings
- URL: http://arxiv.org/abs/2006.09930v2
- Date: Mon, 30 Nov 2020 18:50:51 GMT
- Title: CoSE: Compositional Stroke Embeddings
- Authors: Emre Aksan, Thomas Deselaers, Andrea Tagliasacchi, Otmar Hilliges
- Abstract summary: We present a generative model for complex free-form structures such as stroke-based drawing tasks.
Our approach is suitable for interactive use cases such as auto-completing diagrams.
- Score: 52.529172734044664
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We present a generative model for complex free-form structures such as
stroke-based drawing tasks. While previous approaches rely on sequence-based
models for drawings of basic objects or handwritten text, we propose a model
that treats drawings as a collection of strokes that can be composed into
complex structures such as diagrams (e.g., flow-charts). At the core of the
approach lies a novel autoencoder that projects variable-length strokes into a
latent space of fixed dimension. This representation space allows a relational
model, operating in latent space, to better capture the relationship between
strokes and to predict subsequent strokes. We demonstrate qualitatively and
quantitatively that our proposed approach is able to model the appearance of
individual strokes, as well as the compositional structure of larger diagram
drawings. Our approach is suitable for interactive use cases such as
auto-completing diagrams. We make code and models publicly available at
https://eth-ait.github.io/cose.
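As a rough, minimal sketch of the pipeline the abstract describes (a stroke autoencoder with a fixed-dimensional latent space, plus a relational model that operates on stroke embeddings to predict the next stroke), the code below shows how the pieces could fit together. Module names, layer choices, and sizes are assumptions for illustration, not the released CoSE implementation; training losses and the prediction of stroke starting positions are omitted.

```python
# Minimal sketch of a CoSE-style pipeline, assuming PyTorch.
# Module names, sizes, and layer choices are illustrative assumptions,
# not the authors' released architecture.
import torch
import torch.nn as nn

class StrokeEncoder(nn.Module):
    """Maps a variable-length stroke (T x 2 points) to a fixed-size embedding."""
    def __init__(self, latent_dim=8):
        super().__init__()
        self.rnn = nn.GRU(input_size=2, hidden_size=64, batch_first=True)
        self.proj = nn.Linear(64, latent_dim)

    def forward(self, stroke):                     # stroke: (1, T, 2)
        _, h = self.rnn(stroke)                    # h: (1, 1, 64)
        return self.proj(h[-1])                    # (1, latent_dim)

class StrokeDecoder(nn.Module):
    """Reconstructs a stroke of a requested length from its embedding."""
    def __init__(self, latent_dim=8):
        super().__init__()
        self.expand = nn.Linear(latent_dim, 64)
        self.head = nn.Linear(64, 2)

    def forward(self, z, length):
        h = torch.tanh(self.expand(z))             # (1, 64)
        h = h.unsqueeze(1).repeat(1, length, 1)    # (1, length, 64)
        return self.head(h)                        # (1, length, 2)

class RelationalModel(nn.Module):
    """Attends over existing stroke embeddings to predict the next one."""
    def __init__(self, latent_dim=8):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=latent_dim, nhead=2,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.next_stroke = nn.Linear(latent_dim, latent_dim)

    def forward(self, embeddings):                 # (1, N, latent_dim)
        ctx = self.encoder(embeddings)             # relational context per stroke
        return self.next_stroke(ctx.mean(dim=1))   # (1, latent_dim)

# Usage: embed the strokes drawn so far, predict the next embedding, decode it.
enc, dec, rel = StrokeEncoder(), StrokeDecoder(), RelationalModel()
strokes = [torch.randn(1, t, 2) for t in (12, 30, 7)]   # variable-length strokes
z = torch.stack([enc(s) for s in strokes], dim=1)       # (1, 3, 8)
z_next = rel(z)                                         # predicted next embedding
next_stroke = dec(z_next, length=20)                    # decoded (1, 20, 2) stroke
```

Interactive auto-completion then amounts to repeating this loop: embed the strokes drawn so far, predict the next embedding in latent space, and decode it back into a stroke.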
Related papers
- SketchGPT: Autoregressive Modeling for Sketch Generation and Recognition [4.6519578789100215]
SketchGPT is a flexible framework that employs a sequence-to-sequence autoregressive model for sketch generation and completion.
By mapping complex sketches into simplified sequences of abstract primitives, our approach significantly streamlines the input for autoregressive modeling (a toy illustration of this mapping follows this entry).
arXiv Detail & Related papers (2024-05-06T01:24:14Z)
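A toy illustration of the SketchGPT entry above: a drawing is mapped into a simplified sequence of abstract primitive tokens that a standard autoregressive model can consume. The primitive vocabulary and the classification heuristic below are invented for this example and are not taken from the paper.

```python
# Toy illustration of "sketch -> sequence of abstract primitives".
# The primitive vocabulary and the classification heuristic are invented
# for this example, not taken from SketchGPT.
import math

def classify_primitive(stroke):
    """Map one stroke (list of (x, y) points) to a coarse primitive label."""
    (x0, y0), (x1, y1) = stroke[0], stroke[-1]
    endpoint_dist = math.hypot(x1 - x0, y1 - y0)
    path_len = sum(math.hypot(bx - ax, by - ay)
                   for (ax, ay), (bx, by) in zip(stroke, stroke[1:]))
    if path_len < 1e-6:
        return "DOT"
    if endpoint_dist / path_len > 0.95:        # nearly straight
        return "LINE"
    if endpoint_dist / path_len < 0.2:         # ends almost meet: closed curve
        return "LOOP"
    return "ARC"

def sketch_to_tokens(strokes):
    """Flatten a drawing into a primitive-token sequence for an autoregressive model."""
    tokens = ["<BOS>"]
    for stroke in strokes:
        tokens.append(classify_primitive(stroke))
    tokens.append("<EOS>")
    return tokens

square = [[(0, 0), (1, 0)], [(1, 0), (1, 1)], [(1, 1), (0, 1)], [(0, 1), (0, 0)]]
print(sketch_to_tokens(square))   # ['<BOS>', 'LINE', 'LINE', 'LINE', 'LINE', '<EOS>']
```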
- Sequential Modeling Enables Scalable Learning for Large Vision Models [120.91839619284431]
We introduce a novel sequential modeling approach which enables learning a Large Vision Model (LVM) without making use of any linguistic data.
We define a common format, "visual sentences", in which we can represent raw images and videos as well as annotated data sources.
arXiv Detail & Related papers (2023-12-01T18:59:57Z)
- Self-Supervised Open-Ended Classification with Small Visual Language Models [60.23212389067007]
We present Self-Context Adaptation (SeCAt), a self-supervised approach that unlocks few-shot abilities for open-ended classification with small visual language models.
By using models with approximately 1B parameters, we outperform the few-shot abilities of much larger models, such as Frozen and FROMAGe.
arXiv Detail & Related papers (2023-09-30T21:41:21Z)
- Enhancing Visually-Rich Document Understanding via Layout Structure Modeling [91.07963806829237]
We propose GraphLM, a novel document understanding model that injects layout knowledge into the model.
We evaluate our model on various benchmarks, including FUNSD, XFUND and CORD, and achieve state-of-the-art results.
arXiv Detail & Related papers (2023-08-15T13:53:52Z)
- Link Prediction with Attention Applied on Multiple Knowledge Graph Embedding Models [7.620967781722715]
Knowledge graph embeddings map nodes into a vector space to predict new links, scoring them according to geometric criteria.
No single model can learn all patterns equally well.
In this paper, we combine the query representations from several models into a unified one to incorporate patterns that are independently captured by each model (a schematic of such a combination follows this entry).
arXiv Detail & Related papers (2023-02-13T10:07:26Z)
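The combination described in the link-prediction entry above, merging query representations from several knowledge-graph-embedding models into a unified one, can be illustrated with a generic attention-weighted mixture. The dimensions and the scoring scheme below are assumptions for illustration, not the paper's exact formulation.

```python
# Generic sketch: attention over query representations produced by several
# knowledge-graph-embedding models. Dimensions and the scoring scheme are
# illustrative assumptions, not the paper's exact formulation.
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def combine_queries(query_reps, attn_vector):
    """query_reps: (num_models, dim) per-model query embeddings.
    attn_vector: (dim,) vector (stand-in for a learned parameter) scoring each model."""
    scores = query_reps @ attn_vector            # one relevance score per model
    weights = softmax(scores)                    # attention weights over models
    return weights @ query_reps                  # (dim,) unified representation

rng = np.random.default_rng(0)
per_model = rng.normal(size=(3, 16))             # e.g. queries from three different KGE models
attn = rng.normal(size=16)                       # stand-in for a learned attention parameter
unified = combine_queries(per_model, attn)
print(unified.shape)                             # (16,)
```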
- G-MSM: Unsupervised Multi-Shape Matching with Graph-based Affinity Priors [52.646396621449]
G-MSM is a novel unsupervised learning approach for non-rigid shape correspondence.
We construct an affinity graph on a given set of training shapes in a self-supervised manner.
We demonstrate state-of-the-art performance on several recent shape correspondence benchmarks.
arXiv Detail & Related papers (2022-12-06T12:09:24Z)
- GrannGAN: Graph annotation generative adversarial networks [72.66289932625742]
We consider the problem of modelling high-dimensional distributions and generating new examples of data with complex relational feature structure coherent with a graph skeleton.
The model we propose tackles the problem of generating data features constrained by the specific graph structure of each data point by splitting the task into two phases.
In the first, it models the distribution of features associated with the nodes of the given graph; in the second, it generates the edge features conditioned on the node features (a schematic of this two-phase setup follows this entry).
arXiv Detail & Related papers (2022-12-01T11:49:07Z)
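A schematic of the two-phase setup summarized in the GrannGAN entry above: phase one samples node features for a given graph skeleton, phase two samples edge features conditioned on those node features. The generator functions here are placeholders assumed for the illustration, not the paper's GAN architecture.

```python
# Schematic two-phase graph-annotation generation: node features first,
# then edge features conditioned on them. The generator functions are
# placeholders assumed for this illustration, not GrannGAN's architecture.
import numpy as np

rng = np.random.default_rng(0)

def node_feature_generator(num_nodes, noise_dim=4, feat_dim=3):
    """Phase 1 (placeholder): map per-node noise to node features."""
    noise = rng.normal(size=(num_nodes, noise_dim))
    w = rng.normal(size=(noise_dim, feat_dim))     # stand-in for learned weights
    return np.tanh(noise @ w)

def edge_feature_generator(node_feats, edges, feat_dim=2):
    """Phase 2 (placeholder): edge features conditioned on endpoint node features."""
    pairs = np.stack([np.concatenate([node_feats[i], node_feats[j]]) for i, j in edges])
    w = rng.normal(size=(pairs.shape[1], feat_dim))
    return np.tanh(pairs @ w)

# Graph skeleton: 4 nodes, 3 edges (the structure is given; only features are generated).
edges = [(0, 1), (1, 2), (2, 3)]
node_feats = node_feature_generator(num_nodes=4)
edge_feats = edge_feature_generator(node_feats, edges)
print(node_feats.shape, edge_feats.shape)          # (4, 3) (3, 2)
```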
- Vitruvion: A Generative Model of Parametric CAD Sketches [22.65229769427499]
We present an approach to generative modeling of parametric CAD sketches.
Our model, trained on real-world designs from the SketchGraphs dataset, autoregressively synthesizes sketches as sequences of primitives.
We condition the model on various contexts, including partial sketches (primers) and images of hand-drawn sketches.
arXiv Detail & Related papers (2021-09-29T01:02:30Z)
- SketchGen: Generating Constrained CAD Sketches [34.26732809515799]
We propose SketchGen as a generative model based on a transformer architecture to address the heterogeneity problem.
A highlight of our work is the ability to produce primitives linked via constraints, which enables the final output to be further regularized.
arXiv Detail & Related papers (2021-06-04T20:45:03Z)
- R2D2: Relational Text Decoding with Transformers [18.137828323277347]
We propose a novel framework for modeling the interaction between graphical structures and the natural language text associated with their nodes and edges.
Our proposed method utilizes both the graphical structure and the sequential nature of the texts.
While the proposed model has wide applications, we demonstrate its capabilities on data-to-text generation tasks.
arXiv Detail & Related papers (2021-05-10T19:59:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.