More Photos are All You Need: Semi-Supervised Learning for Fine-Grained
Sketch Based Image Retrieval
- URL: http://arxiv.org/abs/2103.13990v1
- Date: Thu, 25 Mar 2021 17:27:08 GMT
- Title: More Photos are All You Need: Semi-Supervised Learning for Fine-Grained
Sketch Based Image Retrieval
- Authors: Ayan Kumar Bhunia, Pinaki Nath Chowdhury, Aneeshan Sain, Yongxin Yang,
Tao Xiang, Yi-Zhe Song
- Abstract summary: We introduce a novel semi-supervised framework for cross-modal retrieval.
At the centre of our design is a sequential photo-to-sketch generation model.
We also introduce a discriminator-guided mechanism to guard against unfaithful generation.
- Score: 112.1756171062067
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A fundamental challenge faced by existing Fine-Grained Sketch-Based Image
Retrieval (FG-SBIR) models is data scarcity: model performance is
largely bottlenecked by the lack of sketch-photo pairs. Whilst the number of
photos can be easily scaled, each corresponding sketch still needs to be
individually produced. In this paper, we aim to mitigate such an upper bound on
sketch data, and study whether unlabelled photos alone (of which there are many)
can be cultivated for performance gain. In particular, we introduce a novel
semi-supervised framework for cross-modal retrieval that can additionally
leverage large-scale unlabelled photos to account for data scarcity. At the
centre of our semi-supervision design is a sequential photo-to-sketch
generation model that aims to generate paired sketches for unlabelled photos.
Importantly, we further introduce a discriminator-guided mechanism to guard
against unfaithful generation, together with a distillation-loss-based
regularizer to provide tolerance against noisy training samples. Last but not
least, we treat generation and retrieval as two conjugate problems, where a
joint learning procedure is devised for each module to mutually benefit from
each other. Extensive experiments show that our semi-supervised model yields
a significant performance boost over the state-of-the-art supervised
alternatives, as well as existing methods that can exploit unlabelled photos
for FG-SBIR.
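
The abstract does not give the loss formulation, so the following is only a minimal sketch of the general idea of tolerating noisy generated pairs. It assumes a standard triplet loss for retrieval and a hypothetical per-sample confidence in [0, 1] (for example, one derived from a discriminator) that down-weights pseudo sketch-photo pairs. The confidence weighting is a stand-in chosen for illustration, not the paper's discriminator-guided mechanism or its distillation regularizer, and the names `triplet_loss` and `semi_supervised_loss` are illustrative.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(sketch, pos_photo, neg_photo, margin=0.2):
    """Standard triplet loss: pull the matching photo closer than a non-matching one."""
    return max(0.0, margin + euclidean(sketch, pos_photo) - euclidean(sketch, neg_photo))


def semi_supervised_loss(labelled, pseudo, margin=0.2):
    """Mean triplet loss over labelled and pseudo-labelled triplets.

    labelled: list of (sketch, pos_photo, neg_photo) with human-drawn sketches.
    pseudo:   list of (sketch, pos_photo, neg_photo, confidence), where the sketch
              was generated from an unlabelled photo and confidence (hypothetical,
              in [0, 1]) scales how much the noisy sample contributes.
    """
    total = 0.0
    for sketch, pos, neg in labelled:
        total += triplet_loss(sketch, pos, neg, margin)
    for sketch, pos, neg, confidence in pseudo:
        total += confidence * triplet_loss(sketch, pos, neg, margin)
    count = len(labelled) + len(pseudo)
    return total / count if count else 0.0
```

With a confidence of 0 a generated pair contributes nothing to the total (though it still counts towards the mean's denominator), and with a confidence of 1 it is treated like a labelled pair, so the weighting interpolates between purely supervised training and full trust in the generator.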
Related papers
- Reinforcing Pre-trained Models Using Counterfactual Images [54.26310919385808]
This paper proposes a novel framework to reinforce classification models using language-guided generated counterfactual images.
We identify model weaknesses by testing the model using the counterfactual image dataset.
We employ the counterfactual images as an augmented dataset to fine-tune and reinforce the classification model.
arXiv Detail & Related papers (2024-06-19T08:07:14Z)
- Modality-Aware Representation Learning for Zero-shot Sketch-based Image Retrieval [10.568851068989973]
Zero-shot learning offers an efficient solution for a machine learning model to treat unseen categories.
We propose a novel framework that indirectly aligns sketches and photos by contrasting them through texts.
With an explicit modality encoding learned from data, our approach disentangles modality-agnostic semantics from modality-specific information.
arXiv Detail & Related papers (2024-01-10T00:39:03Z)
- Symmetrical Bidirectional Knowledge Alignment for Zero-Shot Sketch-Based Image Retrieval [69.46139774646308]
This paper studies the problem of zero-shot sketch-based image retrieval (ZS-SBIR).
It aims to use sketches from unseen categories as queries to match the images of the same category.
We propose a novel Symmetrical Bidirectional Knowledge Alignment for zero-shot sketch-based image retrieval (SBKA).
arXiv Detail & Related papers (2023-12-16T04:50:34Z)
- Multi-View Unsupervised Image Generation with Cross Attention Guidance [23.07929124170851]
This paper introduces a novel pipeline for unsupervised training of a pose-conditioned diffusion model on single-category datasets.
We identify object poses by clustering the dataset through comparing visibility and locations of specific object parts.
Our model, MIRAGE, surpasses prior work in novel view synthesis on real images.
arXiv Detail & Related papers (2023-12-07T14:55:13Z)
- Active Learning for Fine-Grained Sketch-Based Image Retrieval [1.994307489466967]
The ability to retrieve a photo by mere free-hand sketching highlights the immense potential of fine-grained sketch-based image retrieval (FG-SBIR).
We propose a novel active learning sampling technique that drastically minimises the need for drawing photo sketches.
arXiv Detail & Related papers (2023-09-15T20:07:14Z)
- Exploiting Unlabelled Photos for Stronger Fine-Grained SBIR [103.51937218213774]
This paper advances the fine-grained sketch-based image retrieval (FG-SBIR) literature by putting forward a strong baseline that overshoots prior state-of-the-arts by 11%.
We propose a simple modification to the standard triplet loss, that explicitly enforces separation amongst photos/sketch instances.
For (i) we employ an intra-modal triplet loss amongst sketches to bring sketches of the same instance closer together and away from others, and one more amongst photos to push away different photo instances.
arXiv Detail & Related papers (2023-03-24T03:34:33Z)
- Data-Free Sketch-Based Image Retrieval [56.96186184599313]
We propose Data-Free (DF)-SBIR, where pre-trained, single-modality classification models have to be leveraged to learn cross-modal metric-space for retrieval without access to any training data.
We present a methodology for DF-SBIR, which can leverage knowledge from models independently trained to perform classification on photos and sketches.
Our method also achieves mAPs competitive with data-dependent approaches, all the while requiring no training data.
arXiv Detail & Related papers (2023-03-14T10:34:07Z)
- ACNet: Approaching-and-Centralizing Network for Zero-Shot Sketch-Based Image Retrieval [28.022137537238425]
We propose an Approaching-and-Centralizing Network (termed "ACNet") to jointly optimize sketch-to-photo synthesis and the image retrieval.
The retrieval module guides the synthesis module to generate large amounts of diverse photo-like images which gradually approach the photo domain.
Our approach achieves state-of-the-art performance on two widely used ZS-SBIR datasets and surpasses previous methods by a large margin.
arXiv Detail & Related papers (2021-11-24T19:36:10Z)
- Towards Unsupervised Sketch-based Image Retrieval [126.77787336692802]
We introduce a novel framework that simultaneously performs unsupervised representation learning and sketch-photo domain alignment.
Our framework achieves excellent performance in the new unsupervised setting, and performs comparably or better than state-of-the-art in the zero-shot setting.
arXiv Detail & Related papers (2021-05-18T02:38:22Z)