Domain-Smoothing Network for Zero-Shot Sketch-Based Image Retrieval
- URL: http://arxiv.org/abs/2106.11841v1
- Date: Tue, 22 Jun 2021 14:58:08 GMT
- Title: Domain-Smoothing Network for Zero-Shot Sketch-Based Image Retrieval
- Authors: Zhipeng Wang, Hao Wang, Jiexi Yan, Aming Wu, Cheng Deng
- Abstract summary: Zero-Shot Sketch-Based Image Retrieval (ZS-SBIR) is a novel cross-modal retrieval task.
We propose a novel Domain-Smoothing Network (DSN) for ZS-SBIR.
Our approach notably outperforms the state-of-the-art methods in both Sketchy and TU-Berlin datasets.
- Score: 66.37346493506737
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Zero-Shot Sketch-Based Image Retrieval (ZS-SBIR) is a novel cross-modal
retrieval task, where abstract sketches are used as queries to retrieve natural
images under a zero-shot scenario. Most existing methods regard ZS-SBIR as a
traditional classification problem and employ a cross-entropy or triplet-based
loss to achieve retrieval, which neglects the domain gap between
sketches and natural images and the large intra-class diversity in sketches.
Toward this end, we propose a novel Domain-Smoothing Network (DSN) for ZS-SBIR.
Specifically, a cross-modal contrastive method is proposed to learn generalized
representations to smooth the domain gap by mining relations with additional
augmented samples. Furthermore, a category-specific memory bank with sketch
features is explored to reduce intra-class diversity in the sketch domain.
Extensive experiments demonstrate that our approach notably outperforms the
state-of-the-art methods in both Sketchy and TU-Berlin datasets. Our source
code is publicly available at https://github.com/haowang1992/DSN.
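The abstract names two ingredients: a cross-modal contrastive objective and a category-specific memory bank of sketch features. The following is a minimal sketch of both ideas, not the authors' implementation. It assumes a generic InfoNCE-style loss in which sketch i is paired with image i and the other images in the batch act as negatives, and a momentum-averaged prototype per category. The function names, the temperature of 0.1, and the momentum of 0.9 are hypothetical choices made for illustration. The released DSN code linked above remains the reference.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_modal_info_nce(sketch_feats, image_feats, temperature=0.1):
    """Mean InfoNCE loss over a batch (generic formulation, assumed here).

    sketch_feats[i] and image_feats[i] form the positive pair;
    every other image in the batch serves as a negative.
    """
    sketches = [l2_normalize(s) for s in sketch_feats]
    images = [l2_normalize(m) for m in image_feats]
    total = 0.0
    for i, s in enumerate(sketches):
        logits = [dot(s, m) / temperature for m in images]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        max_logit = max(logits)
        log_denominator = max_logit + math.log(
            sum(math.exp(z - max_logit) for z in logits)
        )
        total += log_denominator - logits[i]
    return total / len(sketches)


def update_memory_bank(bank, label, feature, momentum=0.9):
    """Momentum update of one per-category sketch prototype (hypothetical rule)."""
    if label not in bank:
        bank[label] = list(feature)
    else:
        bank[label] = [
            momentum * old + (1.0 - momentum) * new
            for old, new in zip(bank[label], feature)
        ]
    return bank


if __name__ == "__main__":
    sketches = [[1.0, 0.0], [0.0, 1.0]]
    images = [[1.0, 0.0], [0.0, 1.0]]
    print("aligned loss:", cross_modal_info_nce(sketches, images))
    print("shuffled loss:", cross_modal_info_nce(sketches, images[::-1]))
```

In this toy batch the loss is close to zero when each sketch matches its paired image and much larger when the pairing is reversed, which is the behaviour a contrastive objective is meant to produce.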
Related papers
- Symmetrical Bidirectional Knowledge Alignment for Zero-Shot Sketch-Based Image Retrieval [69.46139774646308]
This paper studies the problem of zero-shot sketch-based image retrieval (ZS-SBIR).
It aims to use sketches from unseen categories as queries to match the images of the same category.
We propose a novel Symmetrical Bidirectional Knowledge Alignment for zero-shot sketch-based image retrieval (SBKA).
arXiv Detail & Related papers (2023-12-16T04:50:34Z)
- Adapt and Align to Improve Zero-Shot Sketch-Based Image Retrieval [85.39613457282107]
Cross-domain nature of sketch-based image retrieval is challenging.
We present an effective "Adapt and Align" approach to address the key challenges.
Inspired by recent advances in image-text foundation models (e.g., CLIP) on zero-shot scenarios, we explicitly align the learned image embedding with a more semantic text embedding to achieve the desired knowledge transfer from seen to unseen classes.
arXiv Detail & Related papers (2023-05-09T03:10:15Z)
- WAD-CMSN: Wasserstein Distance based Cross-Modal Semantic Network for Zero-Shot Sketch-Based Image Retrieval [1.4180331276028657]
Zero-shot sketch-based image retrieval (ZSSBIR) is a widely studied branch of computer vision.
We propose a Wasserstein distance based cross-modal semantic network (WAD-CMSN) for ZSSBIR.
arXiv Detail & Related papers (2022-02-11T05:56:30Z)
- Zero-Shot Sketch Based Image Retrieval using Graph Transformer [18.00165431469872]
We propose a novel graph transformer based zero-shot sketch-based image retrieval (GTZSR) framework for solving ZS-SBIR tasks.
To bridge the domain gap between the visual features, we propose minimizing the Wasserstein distance between images and sketches in a learned domain-shared space.
We also propose a novel compatibility loss that further aligns the two visual domains by bridging the domain gap of one class with respect to the domain gap of all other classes in the training set.
arXiv Detail & Related papers (2022-01-25T09:02:39Z)
- BDA-SketRet: Bi-Level Domain Adaptation for Zero-Shot SBIR [52.78253400327191]
BDA-SketRet is a novel framework performing a bi-level domain adaptation for aligning the spatial and semantic features of the visual data pairs.
Experimental results on the extended Sketchy, TU-Berlin, and QuickDraw exhibit sharp improvements over the literature.
arXiv Detail & Related papers (2022-01-17T18:45:55Z)
- ACNet: Approaching-and-Centralizing Network for Zero-Shot Sketch-Based Image Retrieval [28.022137537238425]
We propose an Approaching-and-Centralizing Network (termed "ACNet") to jointly optimize sketch-to-photo synthesis and image retrieval.
The retrieval module guides the synthesis module to generate large amounts of diverse photo-like images which gradually approach the photo domain.
Our approach achieves state-of-the-art performance on two widely used ZS-SBIR datasets and surpasses previous methods by a large margin.
arXiv Detail & Related papers (2021-11-24T19:36:10Z)
- CrossATNet - A Novel Cross-Attention Based Framework for Sketch-Based Image Retrieval [30.249581102239645]
We propose a novel framework for cross-modal zero-shot learning (ZSL) in the context of sketch-based image retrieval (SBIR).
While we define a cross-modal triplet loss to ensure the discriminative nature of the shared space, an innovative cross-modal attention learning strategy is also proposed to guide feature extraction from the image domain.
arXiv Detail & Related papers (2021-04-20T12:11:12Z)
- Semantically Tied Paired Cycle Consistency for Any-Shot Sketch-based Image Retrieval [55.29233996427243]
Low-shot sketch-based image retrieval is an emerging task in computer vision.
In this paper, we address any-shot, i.e. zero-shot and few-shot, sketch-based image retrieval (SBIR) tasks.
For solving these tasks, we propose a semantically aligned cycle-consistent generative adversarial network (SEM-PCYC).
Our results demonstrate a significant boost in any-shot performance over the state-of-the-art on the extended version of the Sketchy, TU-Berlin and QuickDraw datasets.
arXiv Detail & Related papers (2020-06-20T22:43:53Z)
- Sketch Less for More: On-the-Fly Fine-Grained Sketch Based Image Retrieval [203.2520862597357]
Fine-grained sketch-based image retrieval (FG-SBIR) addresses the problem of retrieving a particular photo instance given a user's query sketch.
We reformulate the conventional FG-SBIR framework to tackle these challenges.
We propose an on-the-fly design that starts retrieving as soon as the user starts drawing.
arXiv Detail & Related papers (2020-02-24T15:36:02Z)