Towards Unsupervised Sketch-based Image Retrieval
- URL: http://arxiv.org/abs/2105.08237v1
- Date: Tue, 18 May 2021 02:38:22 GMT
- Title: Towards Unsupervised Sketch-based Image Retrieval
- Authors: Conghui Hu, Yongxin Yang, Yunpeng Li, Timothy M. Hospedales, Yi-Zhe Song
- Abstract summary: We introduce a novel framework that simultaneously performs unsupervised representation learning and sketch-photo domain alignment.
Our framework achieves excellent performance in the new unsupervised setting, and performs comparably or better than state-of-the-art in the zero-shot setting.
- Score: 126.77787336692802
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current supervised sketch-based image retrieval (SBIR) methods achieve
excellent performance. However, the cost of data collection and labeling
imposes an intractable barrier to practical deployment of real applications. In
this paper, we present the first attempt at unsupervised SBIR to remove the
labeling cost (category annotations and sketch-photo pairings) that is
conventionally needed for training. Existing single-domain unsupervised
representation learning methods perform poorly in this application, due to the
unique cross-domain (sketch and photo) nature of the problem. We therefore
introduce a novel framework that simultaneously performs unsupervised
representation learning and sketch-photo domain alignment. Technically this is
underpinned by exploiting joint distribution optimal transport (JDOT) to align
data from different domains during representation learning, which we extend
with trainable cluster prototypes and feature memory banks to further improve
scalability and efficacy. Extensive experiments show that our framework
achieves excellent performance in the new unsupervised setting, and performs
comparably or better than state-of-the-art in the zero-shot setting.
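The abstract describes the method only at a high level, so the following PyTorch sketch is offered as a rough illustration of how a JDOT-style sketch-photo alignment loss with trainable cluster prototypes could be set up. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (sinkhorn_plan, jdot_alignment_loss), the entropic regularisation, the softmax temperature, and the alpha/beta cost weights are all illustrative choices, and in the paper the photo-side features would additionally be drawn from a feature memory bank rather than only the current batch.

```python
# Illustrative sketch only -- NOT the authors' implementation.
# Assumes sketch/photo encoders already produce d-dimensional features and that
# pseudo-labels come from soft assignments to trainable cluster prototypes.
import math
import torch
import torch.nn.functional as F


def sinkhorn_plan(cost, eps=0.05, n_iters=50):
    """Entropic-regularised OT plan between uniform marginals (log-domain Sinkhorn)."""
    n, m = cost.shape
    log_a = torch.full((n,), -math.log(n), device=cost.device)
    log_b = torch.full((m,), -math.log(m), device=cost.device)
    M = -cost / eps
    u = torch.zeros(n, device=cost.device)
    v = torch.zeros(m, device=cost.device)
    for _ in range(n_iters):
        u = log_a - torch.logsumexp(M + v[None, :], dim=1)
        v = log_b - torch.logsumexp(M + u[:, None], dim=0)
    return torch.exp(M + u[:, None] + v[None, :])  # (n, m) transport plan


def jdot_alignment_loss(f_sketch, f_photo, prototypes, alpha=1.0, beta=1.0):
    """JDOT-style loss: couple feature distance with pseudo-label disagreement.

    f_sketch:   (n, d) sketch features from the shared encoder
    f_photo:    (m, d) photo features (in the paper these would come from a memory bank)
    prototypes: (k, d) trainable cluster prototypes acting as unsupervised "classes"
    """
    f_s = F.normalize(f_sketch, dim=1)
    f_p = F.normalize(f_photo, dim=1)
    protos = F.normalize(prototypes, dim=1)

    # Soft cluster assignments play the role of labels in the joint distribution.
    p_s = F.softmax(f_s @ protos.t() / 0.1, dim=1)  # (n, k)
    p_p = F.softmax(f_p @ protos.t() / 0.1, dim=1)  # (m, k)

    # Ground cost between every sketch/photo pair: feature term + pseudo-label term.
    cost = alpha * torch.cdist(f_s, f_p) ** 2 + beta * torch.cdist(p_s, p_p) ** 2

    # The plan is computed without gradients; the encoder and prototypes are then
    # updated to shrink the transport cost, pulling matched cross-domain pairs together.
    with torch.no_grad():
        plan = sinkhorn_plan(cost)
    return (plan * cost).sum()
```

In a full training loop this alignment term would be added to a standard single-domain unsupervised representation-learning objective, reflecting the abstract's point that representation learning and domain alignment are performed simultaneously.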
Related papers
- Towards Self-Supervised FG-SBIR with Unified Sample Feature Alignment and Multi-Scale Token Recycling [11.129453244307369]
FG-SBIR aims to minimize the distance between sketches and corresponding images in the embedding space.
We propose an effective approach to narrow the gap between the two domains.
It mainly facilitates unified mutual information sharing, both intra- and inter-sample.
arXiv Detail & Related papers (2024-06-17T13:49:12Z)
- Symmetrical Bidirectional Knowledge Alignment for Zero-Shot Sketch-Based Image Retrieval [69.46139774646308]
This paper studies the problem of zero-shot sketch-based image retrieval (ZS-SBIR).
It aims to use sketches from unseen categories as queries to match images of the same category.
We propose a novel Symmetrical Bidirectional Knowledge Alignment approach (SBKA) for zero-shot sketch-based image retrieval.
arXiv Detail & Related papers (2023-12-16T04:50:34Z)
- Active Learning for Fine-Grained Sketch-Based Image Retrieval [1.994307489466967]
The ability to retrieve a photo by mere free-hand sketching highlights the immense potential of fine-grained sketch-based image retrieval (FG-SBIR).
We propose a novel active learning sampling technique that drastically minimises the need for drawing photo sketches.
arXiv Detail & Related papers (2023-09-15T20:07:14Z)
- Adapt and Align to Improve Zero-Shot Sketch-Based Image Retrieval [85.39613457282107]
The cross-domain nature of sketch-based image retrieval is challenging.
We present an effective "Adapt and Align" approach to address the key challenges.
Inspired by recent advances in image-text foundation models (e.g., CLIP) on zero-shot scenarios, we explicitly align the learned image embedding with a more semantic text embedding to achieve the desired knowledge transfer from seen to unseen classes (a minimal illustrative sketch of this kind of text-embedding alignment follows this entry).
arXiv Detail & Related papers (2023-05-09T03:10:15Z)
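For context on the kind of image-text alignment the "Adapt and Align" entry above refers to, here is a minimal hedged sketch in which a trainable image/sketch encoder is aligned with frozen class-name text embeddings (for example, CLIP text-encoder outputs for prompts like "a photo of a <class>"). The function name, prompt convention, and temperature are assumptions for illustration, not the cited paper's method.

```python
# Illustrative sketch only -- not the cited paper's code.
# text_embeds would typically be frozen text-encoder outputs for class-name prompts,
# kept fixed while the image/sketch encoder is trained.
import torch
import torch.nn.functional as F


def text_alignment_loss(img_feats, labels, text_embeds, temperature=0.07):
    """Cross-entropy over cosine similarities to frozen class-name text embeddings.

    img_feats:   (b, d) embeddings from the trainable image/sketch encoder
    labels:      (b,)   indices of the seen-class names for each sample
    text_embeds: (k, d) frozen text embeddings of the k seen class names
    """
    img_feats = F.normalize(img_feats, dim=1)
    text_embeds = F.normalize(text_embeds, dim=1)
    logits = img_feats @ text_embeds.t() / temperature  # (b, k) similarity scores
    return F.cross_entropy(logits, labels)
```

At test time the same cosine similarities, computed against prompts for unseen class names, provide the zero-shot scores that enable transfer from seen to unseen classes.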
- Cross-domain Few-shot Segmentation with Transductive Fine-tuning [29.81009103722184]
We propose to transductively fine-tune the base model on a set of query images under the few-shot setting.
Our method could consistently and significantly improve the performance of prototypical FSS models in all cross-domain tasks.
arXiv Detail & Related papers (2022-11-27T06:44:41Z)
- Feature Representation Learning for Unsupervised Cross-domain Image Retrieval [73.3152060987961]
Current supervised cross-domain image retrieval methods can achieve excellent performance.
The cost of data collection and labeling imposes an intractable barrier to practical deployment in real applications.
We introduce a new cluster-wise contrastive learning mechanism to help extract class semantic-aware features.
arXiv Detail & Related papers (2022-07-20T07:52:14Z)
- Masked Unsupervised Self-training for Zero-shot Image Classification [98.23094305347709]
Masked Unsupervised Self-Training (MUST) is a new approach which leverages two different and complementary sources of supervision: pseudo-labels and raw images.
MUST improves upon CLIP by a large margin and narrows the performance gap between unsupervised and supervised classification.
arXiv Detail & Related papers (2022-06-07T02:03:06Z)
- ACNet: Approaching-and-Centralizing Network for Zero-Shot Sketch-Based Image Retrieval [28.022137537238425]
We propose an Approaching-and-Centralizing Network (termed "ACNet") to jointly optimize sketch-to-photo synthesis and image retrieval.
The retrieval module guides the synthesis module to generate large amounts of diverse photo-like images which gradually approach the photo domain.
Our approach achieves state-of-the-art performance on two widely used ZS-SBIR datasets and surpasses previous methods by a large margin.
arXiv Detail & Related papers (2021-11-24T19:36:10Z)
- More Photos are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval [112.1756171062067]
We introduce a novel semi-supervised framework for cross-modal retrieval.
At the centre of our design is a sequential photo-to-sketch generation model.
We also introduce a discriminator-guided mechanism to guard against unfaithful generation.
arXiv Detail & Related papers (2021-03-25T17:27:08Z)