On Learning Semantic Representations for Million-Scale Free-Hand
Sketches
- URL: http://arxiv.org/abs/2007.04101v1
- Date: Tue, 7 Jul 2020 15:23:22 GMT
- Title: On Learning Semantic Representations for Million-Scale Free-Hand
Sketches
- Authors: Peng Xu, Yongye Huang, Tongtong Yuan, Tao Xiang, Timothy M.
Hospedales, Yi-Zhe Song, Liang Wang
- Abstract summary: We study learning semantic representations for million-scale free-hand sketches.
We propose a dual-branch CNN-RNN network architecture to represent sketches.
We explore learning the sketch-oriented semantic representations in hashing retrieval and zero-shot recognition.
- Score: 146.52892067335128
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we study learning semantic representations for million-scale
free-hand sketches. This is highly challenging due to the domain-unique traits
of sketches, e.g., they are diverse, sparse, abstract, and noisy. We propose a
dual-branch CNN-RNN network architecture to represent sketches, which
simultaneously encodes
both the static and temporal patterns of sketch strokes. Based on this
architecture, we further explore learning the sketch-oriented semantic
representations in two challenging yet practical settings, i.e., hashing
retrieval and zero-shot recognition on million-scale sketches. Specifically, we
use our dual-branch architecture as a universal representation framework to
design two sketch-specific deep models: (i) We propose a deep hashing model for
sketch retrieval, where a novel hashing loss is specifically designed to
accommodate both the abstract and messy traits of sketches. (ii) We propose a
deep embedding model for sketch zero-shot recognition, via collecting a
large-scale edge-map dataset and proposing to extract a set of semantic vectors
from edge-maps as the semantic knowledge for sketch zero-shot domain alignment.
Both deep models are evaluated by comprehensive experiments on million-scale
sketches and outperform the state-of-the-art competitors.
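To make the dual-branch idea concrete, below is a minimal PyTorch sketch of such an encoder: a CNN branch over the rasterized sketch captures the static pattern, an RNN branch over the stroke point sequence captures the temporal pattern, and the two are fused into a single semantic representation. All layer sizes, the `HashingHead`, and the tanh-relaxed binary codes are illustrative assumptions, not the paper's actual architecture or hashing loss.

```python
import torch
import torch.nn as nn

class DualBranchSketchEncoder(nn.Module):
    """Minimal sketch of a dual-branch CNN-RNN encoder (assumed layout).

    The CNN branch consumes the rasterized sketch image (static pattern);
    the RNN branch consumes the stroke point sequence (temporal pattern).
    Branch outputs are fused into one semantic representation.
    """

    def __init__(self, point_dim=3, rnn_hidden=256, embed_dim=512):
        super().__init__()
        # CNN branch: a small conv stack over a 1-channel raster image.
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # RNN branch: GRU over (dx, dy, pen_state) stroke points.
        self.rnn = nn.GRU(point_dim, rnn_hidden, batch_first=True)
        # Late fusion of the two branch features.
        self.fuse = nn.Linear(128 + rnn_hidden, embed_dim)

    def forward(self, raster, strokes):
        # raster:  (B, 1, H, W) rasterized sketch
        # strokes: (B, T, point_dim) stroke point sequence
        static_feat = self.cnn(raster)      # (B, 128)
        _, h = self.rnn(strokes)            # h: (1, B, rnn_hidden)
        temporal_feat = h.squeeze(0)        # (B, rnn_hidden)
        fused = torch.cat([static_feat, temporal_feat], dim=1)
        return self.fuse(fused)             # (B, embed_dim)

# Hypothetical hashing head: a tanh relaxation of binary codes, as is
# common in deep hashing; the paper's sketch-specific hashing loss is
# not reproduced here.
class HashingHead(nn.Module):
    def __init__(self, embed_dim=512, code_bits=64):
        super().__init__()
        self.proj = nn.Linear(embed_dim, code_bits)

    def forward(self, x):
        return torch.tanh(self.proj(x))     # relaxed codes in (-1, 1)

    @torch.no_grad()
    def binarize(self, x):
        return torch.sign(self.proj(x))     # {-1, +1} codes for retrieval
```

At retrieval time one would binarize the embeddings and rank database sketches by Hamming distance; for the zero-shot setting, the paper instead aligns such embeddings with semantic vectors extracted from edge-maps.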
Related papers
- SketchTriplet: Self-Supervised Scenarized Sketch-Text-Image Triplet Generation [6.39528707908268]
There continues to be a lack of large-scale paired datasets for scene sketches.
We propose a self-supervised method for scene sketch generation that does not rely on any existing scene sketch.
We contribute a large-scale dataset centered around scene sketches, comprising highly semantically consistent "text-sketch-image" triplets.
arXiv Detail & Related papers (2024-05-29T06:43:49Z)
- Sketch2Saliency: Learning to Detect Salient Objects from Human Drawings [99.9788496281408]
We study how sketches can be used as a weak label to detect salient objects present in an image.
To accomplish this, we introduce a photo-to-sketch generation model that aims to generate sequential sketch coordinates corresponding to a given visual photo.
Experiments support our hypothesis and show that our sketch-based saliency detection model performs competitively with the state-of-the-art.
arXiv Detail & Related papers (2023-03-20T23:46:46Z)
- Semantics-Preserving Sketch Embedding for Face Generation [26.15479367792076]
We introduce a novel W-W+ encoder architecture to take advantage of the high expressive power of the W+ space.
We also introduce an explicit intermediate representation for sketch semantic embedding.
A novel sketch semantic interpretation approach is designed to automatically extract semantics from vectorized sketches.
arXiv Detail & Related papers (2022-11-23T15:14:49Z)
- I Know What You Draw: Learning Grasp Detection Conditioned on a Few Freehand Sketches [74.63313641583602]
We propose a method to generate a potential grasp configuration relevant to the sketch-depicted objects.
Our model is trained and tested end-to-end, making it easy to implement in real-world applications.
arXiv Detail & Related papers (2022-05-09T04:23:36Z)
- FS-COCO: Towards Understanding of Freehand Sketches of Common Objects in Context [112.07988211268612]
We advance sketch research to scenes with the first dataset of freehand scene sketches, FS-COCO.
Our dataset comprises 10,000 freehand scene vector sketches with per-point space-time information, drawn by 100 non-expert individuals.
We study for the first time the problem of fine-grained image retrieval from freehand scene sketches and sketch captions.
arXiv Detail & Related papers (2022-03-04T03:00:51Z)
- Deep Self-Supervised Representation Learning for Free-Hand Sketch [51.101565480583304]
We tackle the problem of self-supervised representation learning for free-hand sketches.
The key to the success of our self-supervised learning paradigm lies in our sketch-specific designs.
We show that the proposed approach outperforms the state-of-the-art unsupervised representation learning methods.
arXiv Detail & Related papers (2020-02-03T16:28:29Z)
- SketchDesc: Learning Local Sketch Descriptors for Multi-view Correspondence [68.63311821718416]
We study the problem of multi-view sketch correspondence, where we take as input multiple freehand sketches with different views of the same object.
This problem is challenging since the visual features of corresponding points at different views can be very different.
We take a deep learning approach and learn a novel local sketch descriptor from data.
arXiv Detail & Related papers (2020-01-16T11:31:21Z)