CLIPDrawX: Primitive-based Explanations for Text Guided Sketch Synthesis
- URL: http://arxiv.org/abs/2312.02345v1
- Date: Mon, 4 Dec 2023 21:11:42 GMT
- Title: CLIPDrawX: Primitive-based Explanations for Text Guided Sketch Synthesis
- Authors: Nityanand Mathur, Shyam Marjit, Abhra Chaudhuri, Anjan Dutta
- Abstract summary: We show that the latent space of CLIP can be visualized solely in terms of linear transformations on simple geometric primitives like circles and straight lines.
We present CLIPDrawX, an algorithm that provides significantly better visualizations for CLIP text embeddings.
- Score: 4.025987274016071
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the goal of understanding the visual concepts that CLIP associates with
text prompts, we show that the latent space of CLIP can be visualized solely in
terms of linear transformations on simple geometric primitives like circles and
straight lines. Although existing approaches achieve this by
sketch-synthesis-through-optimization, they do so on the space of Bézier
curves, which can evolve into a wastefully large set of structures, most of
which are non-essential for generating meaningful sketches. We
present CLIPDrawX, an algorithm that provides significantly better
visualizations for CLIP text embeddings, using only simple primitive shapes
like straight lines and circles. This constrains the set of possible outputs to
linear transformations on these primitives, thereby exhibiting an inherently
simpler mathematical form. The synthesis process of CLIPDrawX can be tracked
end-to-end, with each visual concept being explained exclusively in terms of
primitives. Implementation will be released upon acceptance. Project Page:
https://clipdrawx.github.io/.
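The abstract's core loop can be illustrated with a toy sketch: a primitive (a unit circle sampled as points) is deformed only by a linear map plus a translation, and those parameters are optimized until an embedding of the transformed primitive matches a target embedding. The feature extractor below is a hypothetical stand-in for CLIP's image encoder, and the finite-difference optimizer is purely illustrative; none of this is the paper's implementation.

```python
import numpy as np

def primitive_circle(n=64):
    """Sample n points on the unit circle (the geometric primitive)."""
    theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    return np.stack([np.cos(theta), np.sin(theta)], axis=1)  # (n, 2)

def embed(points):
    """Toy 'encoder': centroid + second moments of the point set.
    A stand-in for CLIP's image embedding, for illustration only."""
    mean = points.mean(axis=0)
    cov = np.cov(points.T).reshape(-1)
    return np.concatenate([mean, cov])

def loss(params, base, target):
    """Squared embedding distance after applying the linear transform."""
    A = params[:4].reshape(2, 2)   # 2x2 linear map
    t = params[4:]                 # translation
    transformed = base @ A.T + t
    d = embed(transformed) - target
    return float(d @ d)

def optimize(base, target, steps=200, lr=0.05, eps=1e-4):
    """Minimize the loss over (A, t) by central-difference gradient descent."""
    params = np.concatenate([np.eye(2).reshape(-1), np.zeros(2)])
    for _ in range(steps):
        grad = np.zeros_like(params)
        for i in range(params.size):
            hi, lo = params.copy(), params.copy()
            hi[i] += eps
            lo[i] -= eps
            grad[i] = (loss(hi, base, target) - loss(lo, base, target)) / (2 * eps)
        params -= lr * grad
    return params

base = primitive_circle()
# Target embedding: an ellipse (circle stretched 2x in x) shifted to (1, 0.5).
target = embed(base * np.array([2.0, 1.0]) + np.array([1.0, 0.5]))

fitted = optimize(base, target)
final = loss(fitted, base, target)
print(f"final loss: {final:.6f}")
```

Because the output space is restricted to linear transformations of a fixed primitive, the recovered parameters (here, a diagonal stretch plus a translation) directly explain the synthesized shape, which is the interpretability argument the abstract makes.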
Related papers
- Cascade-CLIP: Cascaded Vision-Language Embeddings Alignment for Zero-Shot Semantic Segmentation [72.47110803885235]
We introduce a novel framework named Cascade-CLIP for zero-shot semantic segmentation.
Our framework achieves superior zero-shot performance on segmentation benchmarks like COCO-Stuff, Pascal-VOC, and Pascal-Context.
arXiv Detail & Related papers (2024-06-02T08:32:51Z)
- SketchINR: A First Look into Sketches as Implicit Neural Representations [120.4152701687737]
We propose SketchINR, to advance the representation of vector sketches with implicit neural models.
A variable length vector sketch is compressed into a latent space of fixed dimension that implicitly encodes the underlying shape as a function of time and strokes.
For the first time, SketchINR emulates the human ability to reproduce a sketch with varying abstraction in terms of number and complexity of strokes.
arXiv Detail & Related papers (2024-03-14T12:49:29Z)
- Sketch Video Synthesis [52.134906766625164]
We propose a novel framework for sketching videos represented by frame-wise Bézier curves.
Our method unlocks applications in sketch-based video editing and video doodling, enabled through video composition.
arXiv Detail & Related papers (2023-11-26T14:14:04Z)
- Improving Zero-Shot Generalization for CLIP with Synthesized Prompts [135.4317555866831]
Most existing methods require labeled data for all classes, which may not hold in real-world applications.
We propose a plug-and-play generative approach called SyntHesIzed Prompts (SHIP) to improve existing fine-tuning methods.
arXiv Detail & Related papers (2023-07-14T15:15:45Z)
- SketchXAI: A First Look at Explainability for Human Sketches [104.13322289903577]
This paper introduces human sketches to the landscape of XAI (Explainable Artificial Intelligence).
We argue that sketch, as a "human-centred" data form, represents a natural interface to study explainability.
We design a sketch encoder that accommodates the intrinsic properties of strokes: shape, location, and order.
arXiv Detail & Related papers (2023-04-23T20:28:38Z)
- Linking Sketch Patches by Learning Synonymous Proximity for Graphic Sketch Representation [8.19063619210761]
We propose an order-invariant, semantics-aware method for graphic sketch representations.
The cropped sketch patches are linked according to their global semantics or local geometric shapes, namely the synonymous proximity.
We show that our method significantly improves the performance on both controllable sketch synthesis and sketch healing.
arXiv Detail & Related papers (2022-11-30T09:28:15Z)
- Abstracting Sketches through Simple Primitives [53.04827416243121]
Humans show a high level of abstraction capability in games that require quickly communicating object information.
We propose the Primitive-based Sketch Abstraction task where the goal is to represent sketches using a fixed set of drawing primitives.
Our Primitive-Matching Network (PMN) learns interpretable abstractions of a sketch in a self-supervised manner.
arXiv Detail & Related papers (2022-07-27T14:32:39Z)
- CLIPasso: Semantically-Aware Object Sketching [34.53644912236454]
We present an object sketching method that can achieve different levels of abstraction, guided by geometric and semantic simplifications.
We define a sketch as a set of Bézier curves and use a differentiable rasterizer to optimize the parameters of the curves directly with respect to a CLIP-based perceptual loss.
arXiv Detail & Related papers (2022-02-11T18:35:25Z)
- CLIPDraw: Exploring Text-to-Drawing Synthesis through Language-Image Encoders [0.7734726150561088]
CLIPDraw is an algorithm that synthesizes novel drawings based on natural language input.
It operates over vector strokes rather than pixel images, a constraint that biases drawings towards simpler human-recognizable shapes.
Results compare between CLIPDraw and other synthesis-through-optimization methods.
arXiv Detail & Related papers (2021-06-28T16:43:26Z)
- SketchGen: Generating Constrained CAD Sketches [34.26732809515799]
We propose SketchGen as a generative model based on a transformer architecture to address the heterogeneity problem.
A highlight of our work is the ability to produce primitives linked via constraints, which enables the final output to be further regularized.
arXiv Detail & Related papers (2021-06-04T20:45:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.