Adaptive Fine-Grained Sketch-Based Image Retrieval
- URL: http://arxiv.org/abs/2207.01723v2
- Date: Wed, 6 Jul 2022 15:59:43 GMT
- Title: Adaptive Fine-Grained Sketch-Based Image Retrieval
- Authors: Ayan Kumar Bhunia, Aneeshan Sain, Parth Shah, Animesh Gupta, Pinaki
Nath Chowdhury, Tao Xiang, Yi-Zhe Song
- Abstract summary: Recent focus on Fine-Grained Sketch-Based Image Retrieval has shifted towards generalising a model to new categories.
In real-world applications, a trained FG-SBIR model is often applied to both new categories and different human sketchers.
We introduce a novel model-agnostic meta-learning (MAML) based framework with several key modifications.
- Score: 100.90633284767205
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The recent focus on Fine-Grained Sketch-Based Image Retrieval (FG-SBIR) has
shifted towards generalising a model to new categories without any training
data from them. In real-world applications, however, a trained FG-SBIR model is
often applied to both new categories and different human sketchers, i.e.,
different drawing styles. Although this complicates the generalisation problem,
fortunately, a handful of examples are typically available, enabling the model
to adapt to the new category/style. In this paper, we offer a novel perspective
-- instead of asking for a model that generalises, we advocate for one that
quickly adapts, with just very few samples during testing (in a few-shot
manner). To solve this new problem, we introduce a novel model-agnostic
meta-learning (MAML) based framework with several key modifications: (1) As
FG-SBIR is a retrieval task trained with a margin-based contrastive loss, we
simplify the MAML training in the inner loop to make it more stable and tractable. (2) The margin
in our contrastive loss is also meta-learned with the rest of the model. (3)
Three additional regularisation losses are introduced in the outer loop, to
make the meta-learned FG-SBIR model more effective for category/style
adaptation. Extensive experiments on public datasets suggest a large gain over
generalisation- and zero-shot-based approaches, as well as over several strong
few-shot baselines.
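
The abstract is concrete enough to sketch. Below is a minimal, hypothetical PyTorch illustration of the recipe it describes: a MAML-style episode in which a margin-based triplet loss drives a short, simplified inner loop (point 1) and the margin itself is meta-learned alongside the model (point 2). All names (SketchPhotoEncoder, episode, inner_lr) are illustrative, the update is first-order, and the three outer-loop regularisation losses of point 3 are omitted; this is not the authors' released implementation.

```python
# Hypothetical sketch only: names and hyper-parameters are illustrative,
# the meta-update is first-order, and the paper's three outer-loop
# regularisation losses are omitted.
import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class SketchPhotoEncoder(nn.Module):
    """Toy joint sketch/photo embedding network (the paper uses a CNN backbone)."""
    def __init__(self, in_dim=3 * 64 * 64, emb_dim=64):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(in_dim, emb_dim))

    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)  # unit-norm embeddings

def triplet_loss(model, margin, sketch, pos_photo, neg_photo):
    """Margin-based contrastive loss over (sketch, matching photo, non-matching photo)."""
    s, p, n = model(sketch), model(pos_photo), model(neg_photo)
    d_pos = (s - p).pow(2).sum(-1)
    d_neg = (s - n).pow(2).sum(-1)
    return F.relu(d_pos - d_neg + margin).mean()

def episode(model, log_margin, support, query, inner_lr=0.01, inner_steps=1):
    """One meta-training episode: adapt on the support set, evaluate on the query set.
    Parameterising the margin as exp(log_margin) keeps it positive while it is
    meta-learned jointly with the encoder."""
    margin = log_margin.exp()
    fast = copy.deepcopy(model)                   # "fast" weights for the inner loop
    inner_opt = torch.optim.SGD(fast.parameters(), lr=inner_lr)
    for _ in range(inner_steps):                  # short, simplified inner loop
        inner_opt.zero_grad()
        triplet_loss(fast, margin.detach(), *support).backward()
        inner_opt.step()
    outer = triplet_loss(fast, margin, *query)    # outer-loop objective
    # First-order MAML: route the query-loss gradients back onto the meta
    # model's weights and the meta-learned margin, accumulating across episodes.
    grads = torch.autograd.grad(outer, list(fast.parameters()) + [log_margin])
    for p, g in zip(model.parameters(), grads[:-1]):
        p.grad = g.clone() if p.grad is None else p.grad + g
    lg = grads[-1].clone()
    log_margin.grad = lg if log_margin.grad is None else log_margin.grad + lg
    return outer.item()

model = SketchPhotoEncoder()
log_margin = torch.zeros((), requires_grad=True)
meta_opt = torch.optim.Adam(list(model.parameters()) + [log_margin], lr=1e-3)
```

A meta-training step would zero meta_opt's gradients, run episode over a batch of category/style tasks (each support and query being a (sketch, positive photo, negative photo) triplet batch) so the per-episode gradients accumulate, and then call meta_opt.step() to update both the encoder and log_margin.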
Related papers
- Reinforcing Pre-trained Models Using Counterfactual Images [54.26310919385808]
This paper proposes a novel framework that reinforces classification models using counterfactual images generated under language guidance.
We identify model weaknesses by testing the model using the counterfactual image dataset.
We employ the counterfactual images as an augmented dataset to fine-tune and reinforce the classification model.
arXiv Detail & Related papers (2024-06-19T08:07:14Z)
- Robust Image Ordinal Regression with Controllable Image Generation [12.787258917766005]
We propose a novel framework called CIG based on controllable image generation.
Our main idea is to generate extra training samples with specific labels near category boundaries.
We evaluate the effectiveness of our new CIG approach in three different image ordinal regression scenarios.
arXiv Detail & Related papers (2023-05-07T08:10:56Z)
- Demystifying the Base and Novel Performances for Few-shot Class-incremental Learning [15.762281194023462]
Few-shot class-incremental learning (FSCIL) addresses challenging real-world scenarios where unseen novel classes continually arrive with only a few samples.
The goal is to develop a model that recognizes the novel classes without forgetting prior knowledge.
Our straightforward method is shown to perform comparably to sophisticated state-of-the-art algorithms.
arXiv Detail & Related papers (2022-06-18T00:39:47Z)
- Sketch3T: Test-Time Training for Zero-Shot SBIR [106.59164595640704]
Zero-shot sketch-based image retrieval (ZS-SBIR) typically asks for a trained model to be applied as-is to unseen categories.
We extend ZS-SBIR by asking the model to transfer to both new categories and new sketch distributions.
Our key contribution is a test-time training paradigm that can adapt using just one sketch.
arXiv Detail & Related papers (2022-03-28T12:44:49Z)
- Learning What Not to Segment: A New Perspective on Few-Shot Segmentation [63.910211095033596]
Recently, few-shot segmentation (FSS) has been extensively developed.
This paper proposes a fresh and straightforward insight to alleviate the problem.
In light of the unique nature of the proposed approach, we also extend it to a more realistic but challenging setting.
arXiv Detail & Related papers (2022-03-15T03:08:27Z)
- Few Shot Activity Recognition Using Variational Inference [9.371378627575883]
We propose a novel variational-inference-based architectural framework (HF-AR) for few-shot activity recognition.
Our framework leverages volume-preserving Householder Flow to learn a flexible posterior distribution over the novel classes.
This results in better performance compared to state-of-the-art few-shot approaches for human activity recognition.
arXiv Detail & Related papers (2021-08-20T03:57:58Z)
- Few-shot Classification via Adaptive Attention [93.06105498633492]
We propose a novel few-shot learning method that optimizes and quickly adapts the query-sample representation based on very few reference samples.
As demonstrated experimentally, the proposed model achieves state-of-the-art classification results on various benchmark few-shot classification and fine-grained recognition datasets.
arXiv Detail & Related papers (2020-08-06T05:52:59Z)
- Complementing Representation Deficiency in Few-shot Image Classification: A Meta-Learning Approach [27.350615059290348]
We propose a meta-learning approach with complemented representations network (MCRNet) for few-shot image classification.
In particular, we embed a latent space in which latent codes are reconstructed with extra representation information to compensate for the representation deficiency.
Our end-to-end framework achieves the state-of-the-art performance in image classification on three standard few-shot learning datasets.
arXiv Detail & Related papers (2020-07-21T13:25:54Z)
- FLAT: Few-Shot Learning via Autoencoding Transformation Regularizers [67.46036826589467]
We present a novel regularization mechanism that learns the change in feature representations induced by a distribution of transformations, without using the labels of data examples.
It can minimize the risk of overfitting to the base categories by inspecting transformation-augmented variations at the encoded feature level.
Experimental results show performance superior to current state-of-the-art methods in the literature.
arXiv Detail & Related papers (2019-12-29T15:26:28Z)