Cross-Modal Hierarchical Modelling for Fine-Grained Sketch Based Image
Retrieval
- URL: http://arxiv.org/abs/2007.15103v2
- Date: Tue, 11 Aug 2020 17:14:49 GMT
- Title: Cross-Modal Hierarchical Modelling for Fine-Grained Sketch Based Image
Retrieval
- Authors: Aneeshan Sain, Ayan Kumar Bhunia, Yongxin Yang, Tao Xiang, Yi-Zhe Song
- Abstract summary: We study a further trait of sketches that has been overlooked to date, that is, they are hierarchical in terms of the levels of detail.
In this paper, we design a novel network that is capable of cultivating sketch-specific hierarchies and exploiting them to match sketch with photo at corresponding hierarchical levels.
- Score: 147.24102408745247
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sketch as an image search query is an ideal alternative to text in capturing
the fine-grained visual details. Prior successes on fine-grained sketch-based
image retrieval (FG-SBIR) have demonstrated the importance of tackling the
unique traits of sketches as opposed to photos, e.g., temporal vs. static,
strokes vs. pixels, and abstract vs. pixel-perfect. In this paper, we study a
further trait of sketches that has been overlooked to date, that is, they are
hierarchical in terms of the levels of detail -- a person typically sketches up
to various extents of detail to depict an object. This hierarchical structure
is often visually distinct. In this paper, we design a novel network that is
capable of cultivating sketch-specific hierarchies and exploiting them to match
sketch with photo at corresponding hierarchical levels. In particular, features
from a sketch and a photo are enriched using cross-modal co-attention, coupled
with hierarchical node fusion at every level to form a better embedding space
to conduct retrieval. Experiments on common benchmarks show our method to
outperform state-of-the-arts by a significant margin.
Related papers
- Stylized Face Sketch Extraction via Generative Prior with Limited Data [6.727433982111717]
StyleSketch is a method for extracting high-resolution stylized sketches from a face image.
Using the rich semantics of the deep features from a pretrained StyleGAN, we are able to train a sketch generator with 16 pairs of face and the corresponding sketch images.
arXiv Detail & Related papers (2024-03-17T16:25:25Z) - Sketch2Saliency: Learning to Detect Salient Objects from Human Drawings [99.9788496281408]
We study how sketches can be used as a weak label to detect salient objects present in an image.
To accomplish this, we introduce a photo-to-sketch generation model that aims to generate sequential sketch coordinates corresponding to a given visual photo.
Tests prove our hypothesis and delineate how our sketch-based saliency detection model gives a competitive performance compared to the state-of-the-art.
arXiv Detail & Related papers (2023-03-20T23:46:46Z) - Abstracting Sketches through Simple Primitives [53.04827416243121]
Humans show high-level of abstraction capabilities in games that require quickly communicating object information.
We propose the Primitive-based Sketch Abstraction task where the goal is to represent sketches using a fixed set of drawing primitives.
Our Primitive-Matching Network (PMN), learns interpretable abstractions of a sketch in a self supervised manner.
arXiv Detail & Related papers (2022-07-27T14:32:39Z) - Three-Stream Joint Network for Zero-Shot Sketch-Based Image Retrieval [15.191262439963221]
The Zero-Shot Sketch-based Image Retrieval (ZS-SBIR) is a challenging task because of the large domain gap between sketches and natural images.
We propose a novel Three-Stream Joint Training Network (3 JOIN) for the ZS-SBIR task.
arXiv Detail & Related papers (2022-04-12T09:52:17Z) - Multi-granularity Association Learning Framework for on-the-fly
Fine-Grained Sketch-based Image Retrieval [7.797006835701767]
Fine-grained sketch-based image retrieval (FG-SBIR) addresses the problem of retrieving a particular photo in a given query sketch.
In this study, we aim to retrieve the target photo with the least number of strokes possible (incomplete sketch)
We propose a multi-granularity association learning framework that further optimize the embedding space of all incomplete sketches.
arXiv Detail & Related papers (2022-01-13T14:38:50Z) - DeepFacePencil: Creating Face Images from Freehand Sketches [77.00929179469559]
Existing image-to-image translation methods require a large-scale dataset of paired sketches and images for supervision.
We propose DeepFacePencil, an effective tool that is able to generate photo-realistic face images from hand-drawn sketches.
arXiv Detail & Related papers (2020-08-31T03:35:21Z) - Semantically Tied Paired Cycle Consistency for Any-Shot Sketch-based
Image Retrieval [55.29233996427243]
Low-shot sketch-based image retrieval is an emerging task in computer vision.
In this paper, we address any-shot, i.e. zero-shot and few-shot, sketch-based image retrieval (SBIR) tasks.
For solving these tasks, we propose a semantically aligned cycle-consistent generative adversarial network (SEM-PCYC)
Our results demonstrate a significant boost in any-shot performance over the state-of-the-art on the extended version of the Sketchy, TU-Berlin and QuickDraw datasets.
arXiv Detail & Related papers (2020-06-20T22:43:53Z) - Sketch Less for More: On-the-Fly Fine-Grained Sketch Based Image
Retrieval [203.2520862597357]
Fine-grained sketch-based image retrieval (FG-SBIR) addresses the problem of retrieving a particular photo instance given a user's query sketch.
We reformulate the conventional FG-SBIR framework to tackle these challenges.
We propose an on-the-fly design that starts retrieving as soon as the user starts drawing.
arXiv Detail & Related papers (2020-02-24T15:36:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.