Learning Geometry-aware Representations by Sketching
- URL: http://arxiv.org/abs/2304.08204v1
- Date: Mon, 17 Apr 2023 12:23:32 GMT
- Title: Learning Geometry-aware Representations by Sketching
- Authors: Hyundo Lee, Inwoo Hwang, Hyunsung Go, Won-Seok Choi, Kibeom Kim,
Byoung-Tak Zhang
- Abstract summary: We propose learning to represent a scene by sketching, inspired by human behavior.
Our method, coined Learning by Sketching (LBS), learns to convert an image into a set of colored strokes that explicitly incorporate the geometric information of the scene.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Understanding geometric concepts, such as distance and shape, is essential
for understanding the real world and also for many vision tasks. To incorporate
such information into a visual representation of a scene, we propose learning
to represent the scene by sketching, inspired by human behavior. Our method,
coined Learning by Sketching (LBS), learns to convert an image into a set of
colored strokes that explicitly incorporate the geometric information of the
scene in a single inference step without requiring a sketch dataset. A sketch
is then generated from the strokes where CLIP-based perceptual loss maintains a
semantic similarity between the sketch and the image. We show theoretically
that sketching is equivariant with respect to arbitrary affine transformations
and thus provably preserves geometric information. Experimental results show
that LBS substantially improves the performance of object attribute
classification on the unlabeled CLEVR dataset, domain transfer between the
CLEVR and STL-10 datasets, and diverse downstream tasks, confirming that LBS
provides rich geometric information.
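To make the equivariance claim concrete, here is a minimal formalization in our own notation (the paper's exact statement may differ): write $g$ for the map from an image to stroke parameters and $T$ for an affine transformation acting on both image coordinates and stroke control points.

```latex
% Equivariance of sketching under affine transformations.
% Notation is ours, not necessarily the paper's: g maps an image x to
% stroke parameters, and T(p) = Ap + b is an affine map applied to points.
g(T \cdot x) = T \cdot g(x) \qquad \text{for every affine transformation } T.
% Since affine maps preserve collinearity, parallelism, and ratios of
% lengths along a line, an equivariant g provably carries this geometric
% structure from the image into the stroke representation.
```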
Related papers
- DiffFaceSketch: High-Fidelity Face Image Synthesis with Sketch-Guided
Latent Diffusion Model
We introduce a Sketch-Guided Latent Diffusion Model (SGLDM), an LDM-based network architecture trained on a paired sketch-face dataset.
SGLDM can synthesize high-quality face images with different expressions, facial accessories, and hairstyles from various sketches with different abstraction levels.
arXiv Detail & Related papers (2023-02-14T08:51:47Z)
- SSR-GNNs: Stroke-based Sketch Representation with Graph Neural Networks
This paper investigates a graph representation for sketches, where information about strokes, i.e., the parts of a sketch, is encoded on vertices and inter-stroke information on edges.
The resultant graph representation facilitates the training of Graph Neural Networks for classification tasks.
The proposed representation enables the generation of novel sketches that are structurally similar to, yet separable from, the existing dataset.
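As a rough illustration of such a stroke graph, here is a minimal construction under our own assumptions (a stroke is a point sequence; the vertex and edge features below are placeholders, not the features used in SSR-GNNs):

```python
# Minimal stroke-graph construction (illustrative; not the SSR-GNNs code).
# A sketch is a list of strokes; each stroke is a sequence of (x, y) points.
import networkx as nx
import numpy as np

def sketch_to_graph(strokes):
    """Encode stroke information on vertices and inter-stroke information on edges."""
    g = nx.Graph()
    for i, pts in enumerate(strokes):
        pts = np.asarray(pts, dtype=float)
        # Vertex features: the stroke's own geometry (points and arc length).
        seg_lengths = np.linalg.norm(np.diff(pts, axis=0), axis=1)
        g.add_node(i, points=pts, length=float(seg_lengths.sum()))
    for i in range(len(strokes)):
        for j in range(i + 1, len(strokes)):
            # Edge feature: a simple inter-stroke relation, here the
            # distance between the two strokes' centroids.
            ci = np.asarray(strokes[i], dtype=float).mean(axis=0)
            cj = np.asarray(strokes[j], dtype=float).mean(axis=0)
            g.add_edge(i, j, centroid_dist=float(np.linalg.norm(ci - cj)))
    return g

# Usage: two toy strokes forming an "L" shape.
graph = sketch_to_graph([[(0, 0), (0, 1)], [(0, 0), (1, 0)]])
print(graph.nodes(data=True))
print(graph.edges(data=True))
```

A graph built this way can be fed to any standard GNN library for the classification tasks the paper describes.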
arXiv Detail & Related papers (2022-04-27T19:18:01Z)
- Self-Supervised Image Representation Learning with Geometric Set Consistency
We propose a method for self-supervised image representation learning under the guidance of 3D geometric consistency.
Specifically, we introduce 3D geometric consistency into a contrastive learning framework to enforce the feature consistency within image views.
arXiv Detail & Related papers (2022-03-29T08:57:33Z)
- FS-COCO: Towards Understanding of Freehand Sketches of Common Objects in Context
We advance sketch research to scenes with the first dataset of freehand scene sketches, FS-COCO.
Our dataset comprises 10,000 freehand scene vector sketches with per-point space-time information, drawn by 100 non-expert individuals.
We study for the first time the problem of fine-grained image retrieval from freehand scene sketches and sketch captions.
arXiv Detail & Related papers (2022-03-04T03:00:51Z)
- CLIPasso: Semantically-Aware Object Sketching
We present an object sketching method that can achieve different levels of abstraction, guided by geometric and semantic simplifications.
We define a sketch as a set of Bézier curves and use a differentiable rasterizer to optimize the parameters of the curves directly with respect to a CLIP-based perceptual loss.
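A heavily simplified view of that optimization loop (a sketch under stated assumptions: PyTorch is used, `render` is some differentiable rasterizer turning control points into an image, and `clip_encode` is a CLIP image encoder; these names are ours, not from the paper):

```python
# Illustrative CLIPasso-style optimization loop (not the authors' code).
import torch

def optimize_sketch(target_img, render, clip_encode, n_curves=16, steps=500):
    # Each Bezier curve is parameterized by 4 control points in [0, 1]^2.
    ctrl = torch.rand(n_curves, 4, 2, requires_grad=True)
    opt = torch.optim.Adam([ctrl], lr=1e-2)
    with torch.no_grad():
        target_feat = clip_encode(target_img)      # fixed target embedding
    for _ in range(steps):
        opt.zero_grad()
        sketch = render(ctrl)                      # differentiable rasterization
        # CLIP-based perceptual loss: pull the sketch's embedding toward
        # the target image's embedding.
        loss = 1 - torch.cosine_similarity(clip_encode(sketch),
                                           target_feat, dim=-1).mean()
        loss.backward()                            # gradients reach curve params
        opt.step()
    return ctrl.detach()
```

Fewer curves yield a more abstract sketch, which is how the method controls the level of abstraction.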
arXiv Detail & Related papers (2022-02-11T18:35:25Z)
- One Sketch for All: One-Shot Personalized Sketch Segmentation
We present the first one-shot personalized sketch segmentation method.
We aim to segment all sketches belonging to the same category given a single exemplar sketch with part annotations.
We preserve the part semantics embedded in the exemplar and remain robust to input style and abstraction.
arXiv Detail & Related papers (2021-12-20T20:10:44Z)
- Semantically Tied Paired Cycle Consistency for Any-Shot Sketch-based Image Retrieval
Low-shot sketch-based image retrieval is an emerging task in computer vision.
In this paper, we address any-shot, i.e. zero-shot and few-shot, sketch-based image retrieval (SBIR) tasks.
For solving these tasks, we propose a semantically aligned cycle-consistent generative adversarial network (SEM-PCYC).
Our results demonstrate a significant boost in any-shot performance over the state of the art on the extended versions of the Sketchy, TU-Berlin, and QuickDraw datasets.
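For intuition, the generic cycle-consistency objective that such adversarial frameworks build on looks as follows (standard formulation in our notation; SEM-PCYC's exact losses may differ):

```latex
% Generic cycle-consistency loss (standard form, our notation; the exact
% SEM-PCYC objective may differ). G maps one domain (e.g., sketches) into
% a shared semantic space and F maps back:
\mathcal{L}_{\mathrm{cyc}} = \mathbb{E}_{x}\big[\,\lVert F(G(x)) - x \rVert_{1}\,\big]
% Minimizing this loss ties the forward and backward mappings together,
% keeping sketch and image embeddings semantically aligned without paired
% supervision for every class.
```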
arXiv Detail & Related papers (2020-06-20T22:43:53Z)
- Sketch-BERT: Learning Sketch Bidirectional Encoder Representation from Transformers by Self-supervised Learning of Sketch Gestalt
We present a model for learning Sketch Bidirectional Encoder Representation from Transformers (Sketch-BERT).
We generalize BERT to the sketch domain with newly proposed components and pre-training algorithms.
We show that the learned representation of Sketch-BERT improves the performance of downstream tasks such as sketch recognition, sketch retrieval, and sketch gestalt.
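As a rough sketch of the self-supervised setup such a model relies on (our own minimal example, not the Sketch-BERT code): points of a sketch sequence are randomly masked, and the transformer is trained to recover them, analogous to masked language modeling.

```python
# Generic masked-modeling setup for stroke sequences (illustrative only;
# not the Sketch-BERT implementation). A sketch is a sequence of
# (dx, dy, pen_state) tokens, one row per point.
import torch

def mask_sketch(seq, mask_ratio=0.15, mask_value=0.0):
    """Hide a random fraction of points; a model learns to reconstruct them."""
    seq = seq.clone()
    mask = torch.rand(seq.shape[0]) < mask_ratio   # which points to hide
    targets = seq[mask].clone()                    # ground truth to predict
    seq[mask] = mask_value                         # replace with a mask token
    return seq, mask, targets

# Usage: a toy sketch of 5 points.
toy = torch.randn(5, 3)
masked, mask, targets = mask_sketch(toy)
```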
arXiv Detail & Related papers (2020-05-19T01:35:44Z)
- SketchDesc: Learning Local Sketch Descriptors for Multi-view Correspondence
We study the problem of multi-view sketch correspondence, where we take as input multiple freehand sketches with different views of the same object.
This problem is challenging since the visual features of corresponding points at different views can be very different.
We take a deep learning approach and learn a novel local sketch descriptor from data.
arXiv Detail & Related papers (2020-01-16T11:31:21Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.