ACNet: Approaching-and-Centralizing Network for Zero-Shot Sketch-Based
Image Retrieval
- URL: http://arxiv.org/abs/2111.12757v1
- Date: Wed, 24 Nov 2021 19:36:10 GMT
- Title: ACNet: Approaching-and-Centralizing Network for Zero-Shot Sketch-Based
Image Retrieval
- Authors: Hao Ren, Ziqiang Zheng, Yang Wu, Hong Lu, Yang Yang, Sai-Kit Yeung
- Abstract summary: We propose an Approaching-and-Centralizing Network (termed "ACNet") to jointly optimize sketch-to-photo synthesis and image retrieval.
The retrieval module guides the synthesis module to generate large amounts of diverse photo-like images which gradually approach the photo domain.
Our approach achieves state-of-the-art performance on two widely used ZS-SBIR datasets and surpasses previous methods by a large margin.
- Score: 28.022137537238425
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The huge domain gap between sketches and photos and the highly abstract
sketch representations pose challenges for sketch-based image retrieval
(SBIR). The zero-shot sketch-based image retrieval
(ZS-SBIR) is more generic and practical but poses an even greater
challenge because of the additional knowledge gap between the seen and unseen
categories. To simultaneously mitigate both gaps, we propose an
Approaching-and-Centralizing Network (termed
"ACNet") to jointly optimize sketch-to-photo synthesis and the image
retrieval. The retrieval module guides the synthesis module to generate large
amounts of diverse photo-like images which gradually approach the photo domain,
and thus better serve the retrieval module than ever to learn domain-agnostic
representations and category-agnostic common knowledge for generalizing to
unseen categories. These diverse images generated with retrieval guidance can
effectively alleviate the overfitting problem troubling concrete
category-specific training samples with high gradients. We also discover the
use of proxy-based NormSoftmax loss is effective in the zero-shot setting
because its centralizing effect can stabilize our joint training and promote
the generalization ability to unseen categories. Our approach is simple yet
effective and achieves state-of-the-art performance on two widely used
ZS-SBIR datasets and surpasses previous methods by a large margin.
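
The abstract credits a proxy-based NormSoftmax loss with a centralizing effect. As a rough illustration only, the sketch below shows the general form of such a loss: embeddings and learnable class proxies are L2-normalized, their cosine similarities are scaled by a temperature, and a softmax cross-entropy pulls each sample toward the proxy of its class. The function names, the temperature value of 0.05, and the use of NumPy are assumptions made for this example; this is not the authors' implementation.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length along `axis` (eps guards against zero norm)."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def norm_softmax_loss(embeddings, proxies, labels, temperature=0.05):
    """Illustrative proxy-based NormSoftmax loss (hypothetical helper).

    embeddings:  (batch, dim) features from sketches or photos
    proxies:     (num_classes, dim) one learnable proxy per seen class
    labels:      (batch,) integer class index of each embedding
    temperature: assumed value; smaller values sharpen the softmax
    """
    z = l2_normalize(embeddings)
    p = l2_normalize(proxies)
    # Cosine similarity between every embedding and every class proxy.
    logits = z @ p.T / temperature
    # Numerically stable log-softmax over classes.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Cross-entropy against the proxy of the true class, averaged over the batch.
    return -log_probs[np.arange(len(labels)), labels].mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    proxies = rng.normal(size=(4, 8))          # 4 seen classes, 8-dim features
    embeddings = proxies + 0.1 * rng.normal(size=(4, 8))
    labels = np.array([0, 1, 2, 3])
    print("loss:", norm_softmax_loss(embeddings, proxies, labels))
```

Because both sides are normalized, the loss depends only on directions, so samples from either domain that share a class are drawn toward the same proxy direction, which is one way to read the centralizing effect described above.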
Related papers
- Symmetrical Bidirectional Knowledge Alignment for Zero-Shot Sketch-Based
Image Retrieval [69.46139774646308]
This paper studies the problem of zero-shot sketch-based image retrieval (ZS-SBIR).
It aims to use sketches from unseen categories as queries to match the images of the same category.
We propose a novel Symmetrical Bidirectional Knowledge Alignment (SBKA) method for zero-shot sketch-based image retrieval.
arXiv Detail & Related papers (2023-12-16T04:50:34Z)
- Coarse-to-Fine: Learning Compact Discriminative Representation for
Single-Stage Image Retrieval [11.696941841000985]
Two-stage methods following the retrieve-and-rerank paradigm have achieved excellent performance, but their separate local and global modules are inefficient for real-world applications.
We propose a mechanism which attentively selects prominent local descriptors and infuses fine-grained semantic relations into the global representation.
Our method achieves state-of-the-art single-stage image retrieval performance on benchmarks such as Revisited Oxford and Revisited Paris.
arXiv Detail & Related papers (2023-08-08T03:06:10Z)
- Adapt and Align to Improve Zero-Shot Sketch-Based Image Retrieval [85.39613457282107]
The cross-domain nature of sketch-based image retrieval is challenging.
We present an effective "Adapt and Align" approach to address the key challenges.
Inspired by recent advances in image-text foundation models (e.g., CLIP) on zero-shot scenarios, we explicitly align the learned image embedding with a more semantic text embedding to achieve the desired knowledge transfer from seen to unseen classes.
arXiv Detail & Related papers (2023-05-09T03:10:15Z)
- Distribution Aligned Feature Clustering for Zero-Shot Sketch-Based Image
Retrieval [18.81230334624234]
This paper tackles the challenges from a new perspective: utilizing gallery image features.
We propose a Cluster-then-Retrieve (ClusterRetri) method that performs clustering on the gallery images and uses the cluster centroids as proxies for retrieval.
Despite its simplicity, our proposed method outperforms the state-of-the-art methods by a large margin on popular datasets.
arXiv Detail & Related papers (2023-01-17T03:58:12Z)
- Domain-Smoothing Network for Zero-Shot Sketch-Based Image Retrieval [66.37346493506737]
Zero-Shot Sketch-Based Image Retrieval (ZS-SBIR) is a novel cross-modal retrieval task.
We propose a novel Domain-Smoothing Network (DSN) for ZS-SBIR.
Our approach notably outperforms the state-of-the-art methods on both the Sketchy and TU-Berlin datasets.
arXiv Detail & Related papers (2021-06-22T14:58:08Z)
- Towards Unsupervised Sketch-based Image Retrieval [126.77787336692802]
We introduce a novel framework that simultaneously performs unsupervised representation learning and sketch-photo domain alignment.
Our framework achieves excellent performance in the new unsupervised setting, and performs comparably or better than state-of-the-art in the zero-shot setting.
arXiv Detail & Related papers (2021-05-18T02:38:22Z)
- More Photos are All You Need: Semi-Supervised Learning for Fine-Grained
Sketch Based Image Retrieval [112.1756171062067]
We introduce a novel semi-supervised framework for cross-modal retrieval.
At the centre of our design is a sequential photo-to-sketch generation model.
We also introduce a discriminator guided mechanism to guide against unfaithful generation.
arXiv Detail & Related papers (2021-03-25T17:27:08Z)
- The Power of Triply Complementary Priors for Image Compressive Sensing [89.14144796591685]
We propose a joint low-rank deep (LRD) image model, which contains triply complementary priors.
We then propose a novel hybrid plug-and-play framework based on the LRD model for image CS.
To make the optimization tractable, a simple yet effective algorithm is proposed to solve the proposed H-based image CS problem.
arXiv Detail & Related papers (2020-05-16T08:17:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.